AUDIO-VIDEO TRANSMITTER AND AUDIO-VIDEO RECEIVER,
DATA-PROCESSING APPARATUS AND METHOD,
WAVEFORM-DATA-TRANSMITTING METHOD AND APPARATUS AND
WAVEFORM-DATA-RECEIVING METHOD AND APPARATUS, AND
VIDEO-TRANSMITTING METHOD AND APPARATUS AND
VIDEO-RECEIVING METHOD AND APPARATUS

Technical Field 

The present invention relates to an audio-video
transmitter and an audio-video receiver, a data-processing
apparatus and method, a waveform-data-transmitting method
and apparatus and a waveform-data-receiving method and
apparatus, and a video-transmitting method and apparatus and
a video-receiving method and apparatus.

Background Art 

There has been an apparatus that aims at realistic picture
communication by giving the sense that a counterpart is
actually present in front of you. It does so by extracting,
for example, a person's picture out of the scenery picture
of the space in which you are present, superimposing that
picture, a person's picture sent from the counterpart, and
a previously stored picture of a virtual space to be
displayed in common with the counterpart on each other, and
displaying the result (Japanese Patent Publication
No. 4-24914).

In particular, the prior art includes inventions
concerned with accelerating picture synthesis and with
reducing the amount of memory required (e.g., Official
Gazette of Japanese Patent Publication No. 5-46592:
Picture synthesizer).

Though the prior art has proposed communication systems
that use picture synthesis to combine two-dimensional
static pictures or three-dimensional CG data, it has not
specifically discussed a method for realizing a system that
simultaneously synthesizes and displays a plurality of
video (picture) streams and a plurality of audio streams.

That is, there has been a problem that no specific
discussion has been performed from the following
viewpoints:

(A1) a method for transmitting (communicating and
broadcasting) and controlling pictures and audio under an
environment in which data and control information
(information transmitted by a packet different from that
of data to control the processing of the terminal side) are
independently transmitted by using a plurality of logical
transmission lines constructed by software on one or more
real transmission lines;

(A2) a method for dynamically changing header
information (corresponding to data control information of
the present invention) to be added to data for a picture
or audio to be transmitted;

(A3) a method for dynamically changing header
information (corresponding to transmission control
information of the present invention) to be added for
transmission;

(A4) a method for transmitting information by
dynamically multiplexing and separating a plurality of
logical transmission lines;

(A5) a method for transmitting pictures and audio 
considering the read and rise periods of program or data; 
and 

(A6) a method for transmitting pictures and audio 
considering zapping. 

However, a method for changing encoding systems and
a method for discarding data in units of frames in
accordance with the frame type of a picture have been
proposed so far as methods for dynamically adjusting the
amount of data to be transmitted to a network (H. Jinzenji
and T. Tajiri, A study of distributive-adaptive-type VOD
system, D-81, System Society of Institute of Electronics,
Information and Communication Engineers (IEICE) (1995)).

A dynamic throughput scalable algorithm capable of
providing a high-quality video under a restricted
processing time is proposed as a method for adjusting
throughput at the encoder side (T. Osako, Y. Yajima, H.
Kodera, H. Watanabe, K. Shimamura: Encoding of software
video using a dynamic throughput scalable algorithm, Thesis
Journal of IEICE, D-2, Vol. 80-D-2, No. 2, pp. 444-458
(1997)).

Moreover, there is an MPEG1/MPEG2 system as an example
of realizing synchronous reproduction of video and audio. 

(B1) The conventional method for discarding a picture
in accordance with the frame type of the video has the
problem that, because the grading of the information which
can be handled is limited to a single stream, it is
difficult to handle a plurality of video streams or a
plurality of audio streams and to reproduce an important
scene cut preferentially and synchronously with audio in a
way that reflects the intention of an editor.

(B2) Moreover, because it is a prerequisite that
MPEG1/MPEG2 is realized by hardware, a decoder must be
able to decode every supplied bit stream. Therefore, how
to cope with a bit stream that exceeds the throughput of
the decoder is a problem.

Moreover, to transmit video, there have been some
systems including a system such as H.261 (ITU-T
Recommendation H.261, Video codec for audiovisual
services at p × 64 kbit/s) and they have been implemented
in hardware. Because the upper limit of the necessary
performance is taken into account when the hardware is
designed, the case in which decoding is not completed
within a designated time has not occurred.

The designated time mentioned above denotes the time
required to transmit the bit stream obtained by coding one
frame of video. If decoding is not completed within this
time, the excess time becomes a delay. If the delay
accumulates, the delay from the transmitting side to the
receiving side increases and the system cannot be used as
a video telephone. This state must be avoided.
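
The way such a delay builds up can be illustrated with a minimal sketch. The function name `accumulated_delay`, the millisecond units, and the assumption that spare time in a fast frame can absorb earlier delay are all hypothetical choices made for illustration; they are not taken from the specification.

```python
def accumulated_delay(decode_times_ms, designated_time_ms):
    """Return the end-to-end delay after each frame, in milliseconds.

    decode_times_ms: time actually spent decoding each frame.
    designated_time_ms: time needed to transmit one coded frame.
    Assumption: time left over in a fast frame reduces earlier delay,
    but the delay never becomes negative.
    """
    delay = 0
    history = []
    for decode_time in decode_times_ms:
        # Any time beyond the designated time is added to the delay.
        delay = max(0, delay + decode_time - designated_time_ms)
        history.append(delay)
    return history


# Two slow frames build up 40 ms of delay; one fast frame recovers 20 ms.
delays = accumulated_delay([120, 120, 80], 100)
```

If every frame takes longer than the designated time, the returned delay grows without bound, which is the situation the text says must be avoided.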

Moreover, when decoding cannot be completed within a 
designated time because a communication counterpart 
generates an out-of-spec bit stream, a problem occurs that 
a video cannot be transmitted. 

The above problem occurs not only for a video but also 
for audio data. 

However, in recent years, the network environment
formed by personal computers (PCs) has been put in place
as a result of the spread of the Internet and ISDN, so
transmission rates have improved and it has become possible
to transmit video by using PCs and a network. Moreover,
user demand for video transmission has increased rapidly.
Furthermore, video can now be decoded entirely by software
because CPU performance has improved.

However, because the same software can be executed by
personal computers that differ in configuration, such as
CPU, bus width, or accelerator, it is difficult to know in
advance the upper limit of the necessary performance, and
therefore a problem occurs that a picture cannot be decoded
within a designated time.

Moreover, when coded data for a video having a length 
exceeding the throughput of a receiver is transmitted,
coding cannot be completed within a designated time. 






Problem (C1): Decreasing a delay by decoding a picture
within a designated time.

When a video is input as the waveform data of claim
C1 of the present invention, or output as the waveform
data of claim C7 of the present invention, as means for
solving problem (C1), a problem may remain that the
substantial working efficiency of a transmission line is
lowered because a part of a transmitted bit stream is not
used. Moreover, there are some coding systems that
generate a present decoded picture in accordance with the
last decoded picture (e.g., P picture). However, because
the last decoded picture is not completely restored by the
means for solving problem (C1), there is a problem that
deterioration of the picture quality propagates and
increases as time passes.

Problem (C2): In the case of the means for solving
problem (C1), the substantial working efficiency of a
transmission line is lowered. Moreover, picture-quality
deterioration is spread.

Furthermore, in the case of implementation by software,
the frame rate of a picture is determined by the time
required for one-time coding. Therefore, when the frame
rate designated by a user exceeds the throughput of a
computer, it is impossible to meet the designation.

Problem (C3): When the frame rate designated by a user
exceeds the throughput of a computer, it is impossible to
meet the designation.

Disclosure of the Invention

When considering the problems (A1) to (A6) of the first
prior art, it is an object of the present invention to 
provide an audio-video transmitter and audio-video 
receiver and data-processing apparatus and method in order 
to solve at least any one of the problems. 

Moreover, when considering the problems (B1) and (B2)
of the second prior art, it is another object of the present 
invention to provide data-processing apparatus and method 
in order to solve at least one of the problems. 

Furthermore, when considering the problems (C1) to
(C3) of the last prior art, it is still another object of 
the present invention to provide waveform-data-receiving 
method and apparatus and waveform-data-transmitting method 
and apparatus, and video-transmitting method and apparatus 
and video-receiving method and apparatus in order to solve 
at least one of the problems. 

The present invention according to claim 1 is an 
audio-video transmitting apparatus comprising 
transmitting means for transmitting the content concerned 
with a transmitting method and/or the structure of data to 
be transmitted or an identifier showing the content as 
transmission format information through a transmission 
line same as that of the data to be transmitted or a 
transmission line different from the data transmission 
line; wherein 

said data to be transmitted is video data and/or audio data.

The present invention according to claim 2 is the
audio-video transmitting apparatus according to claim 1, 
wherein said transmission format information is included 
in at least one of data control information added to said 
data to control said data, transmission control information 
added to said data to transmit said data, and information 
for controlling the processing of the terminal side. 

The present invention according to claim 3 is the 
audio-video transmitting apparatus according to claim 2, 
wherein at least one of said data control information, 
transmission control information, and information for 
controlling the processing of said terminal side is 
dynamically changed. 

The present invention according to claim 4 is the 
audio-video transmitting apparatus according to claim 3, 
wherein said data is divided into a plurality of packets, 
and said data control information or said transmission 
control information is added not only to the head packet 
of said divided packets but also to a middle packet of them. 

The present invention according to claim 5 is the 
audio-video transmitting apparatus according to claim 1, 
wherein an identifier showing whether to use timing 
information concerned with said data as information showing 
the reproducing time of said data is included in said 
transmission format information. 

The present invention according to claim 6 is the
audio-video transmitting apparatus according to claim 1,
wherein said transmission format information is the
structural information of said data, and said transmitting
means transmits the corresponding data to a receiving
apparatus after confirming a signal, output from the
receiving apparatus that received the transmitted
structural information of said data, indicating that said
data can be received.

The present invention according to claim 7 is the 
audio-video transmitting apparatus according to claim 1, 
wherein said transmission format information includes (1)
an identifier for identifying a program or data to be used 
by a receiving apparatus later and (2) at least one of a 
flag, counter, and timer as information for knowing the 
point of time in which said program or data is used or the 
term of validity for using said program or data. 

The present invention according to claim 8 is the 
audio-video transmitting apparatus according to claim 7,
wherein said point of time in which said program or data 
is used is transmitted as transmission control information 
by using a transmission serial number for identifying a 
transmission sequence or as information to be transmitted 
by a packet different from that of data to control 
terminal-side processing. 

The present invention according to claim 9 is the 
audio-video transmitting apparatus according to claim 2 or 
3, wherein storing means for storing a plurality of contents
concerned with said transmitting method and/or said
structure of data to be transmitted and a plurality of its
identifiers are included, and said identifier is included
in at least one of said data control information,
transmission control information, and information for
controlling terminal-side processing as said transmission 
format information. 

The present invention according to claim 10 is the 
audio-video transmitting apparatus according to claim 2 or 
3, wherein storing means for storing a plurality of contents
concerned with said transmitting method and/or said 
structure of data to be transmitted are included, and 
said contents are included in at least one of said data 
control information, transmission control information, and 
information for controlling terminal-side processing as 
said transmission format information. 

The present invention according to claim 11 is the 
audio-video transmitting apparatus according to claim 1, 
2, or 3, wherein a default identifier showing whether to 
change the contents concerned with said transmitting method 
and/or structure of data to be transmitted is added. 

The present invention according to claim 12 is the 
audio-video transmitting apparatus according to claim 9, 
10, or 11, wherein said identifier or said default 
identifier is added to a predetermined fixed-length region 
of information to be transmitted or said predetermined 
position. 

The present invention according to claim 13 is an 
audio-video receiving apparatus comprising: receiving 
means for receiving said transmission format information 
transmitted from the audio-video transmitting apparatus of
any one of claims 1 to 12; and transmitted-information
interpreting means for interpreting said received 
transmission-format information. 

The present invention according to claim 14 is the 
audio-video receiving apparatus according to claim 13,
wherein storing means for storing a plurality of contents
concerned with said transmitting method and/or said
structure of data to be transmitted and a plurality of its
identifiers are included, and the contents stored in said
storing means are used to interpret said transmission 
format information. 

The present invention according to claim 15 is an
audio-video transmitting apparatus comprising:
information multiplexing means for controlling start and
end of multiplexing the information for a plurality of
logical transmission lines for transmitting data and/or
control information; wherein not only said
data and/or control information multiplexed by said
information multiplexing means but also control contents
concerned with start and end of said multiplexing by said
information multiplexing means are transmitted as
multiplexing control information, and said data includes
video data and/or audio data.

The present invention according to claim 16 is the 
audio-video transmitting apparatus according to claim 15,
wherein it is possible to select whether to transmit said
multiplexing control information by arranging said
information without multiplexing it before said data and/or
control information or transmit said multiplexing control
information through a transmission line different from the
transmission line for transmitting said data and/or control
information.

The present invention according to claim 17 is an 
audio-video receiving apparatus comprising: receiving 
means for receiving said multiplexing control information 
transmitted from the audio-video transmitting apparatus of 
claim 15 and said multiplexed data and/or control 
information; and separating means for separating said 
multiplexed data and/or control information in accordance 
with said multiplexing control information. 
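
The relation between the multiplexed data and the multiplexing control information can be sketched as follows. This is only an illustration under stated assumptions: the layout in which the control information records, for each logical transmission line, where its data starts and ends inside one multiplexed payload is hypothetical, as are the names `multiplex` and `separate`. The specification does not define this format here.

```python
def multiplex(channels):
    """Multiplex several logical transmission lines into one payload.

    channels: dict mapping a logical-line identifier to its bytes.
    Returns (control_info, payload), where control_info is a list of
    (line_id, start, end) tuples. This layout is a hypothetical example.
    """
    control_info = []
    payload = b""
    for line_id, data in channels.items():
        start = len(payload)
        payload += data
        # Record where this logical line starts and ends in the payload.
        control_info.append((line_id, start, len(payload)))
    return control_info, payload


def separate(control_info, payload):
    """Separate a multiplexed payload by following the control information."""
    return {line_id: payload[start:end] for line_id, start, end in control_info}


control, mixed = multiplex({"video": b"VVVV", "audio": b"AA", "ctrl": b"C"})
restored = separate(control, mixed)
```

Because the receiver relies only on the control information to split the payload, the control information can be sent ahead of the payload or over a different line, as long as it arrives before separation starts.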

The present invention according to claim 18 is an 
audio-video receiving apparatus comprising: main 
looking-listening means for looking at and listening to a 
broadcast program; and auxiliary looking-listening means 
for cyclically detecting the state of a broadcast program 
other than the broadcast program looked at and listened to
through said main looking-listening means; wherein said
detection is performed so that a program and/or data
necessary when said broadcast program looked at and listened
to through said main looking-listening means is switched to
another broadcast program can be smoothly processed, and

said data includes video data and/or audio data. 

The present invention according to claim 19 is the 
audio-video transmitting apparatus according to claim 1, 
wherein priority values can be changed in accordance with 
the situation by transmitting the offset value of 
information showing the priority for processing of said
data. 

The present invention according to claim 20 is an 
audio-video receiving apparatus comprising: receiving 
means for receiving encoded information to which the 
information concerned with the priority for processing 
under an overload state is previously added; and priority 
deciding means for deciding a threshold serving as a 
criterion for selecting whether to process an object in said 
information received by said receiving means; wherein 

the timing for outputting said received information 
is compared with the elapsed time after start of processing 
or the timing for decoding said received information is 
compared with the elapsed time after start of processing 
to change said threshold in accordance with the comparison 
result, and video data and/or audio data are or is included 
as said encoding object. 
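
A minimal sketch of how such a threshold might follow the comparison between the scheduled timing and the elapsed time is given below. Several points are assumptions made for illustration and are not stated in the specification: a larger priority value is taken to mean more important information, the threshold moves by one step per comparison, the range 0 to 7 is arbitrary, and the simulated clock advances only by the cost of the units that are actually processed.

```python
def update_threshold(threshold, scheduled_time, elapsed_time,
                     min_threshold=0, max_threshold=7):
    """Raise the threshold when processing is late, lower it when early."""
    if elapsed_time > scheduled_time:      # behind schedule: overload
        return min(max_threshold, threshold + 1)
    if elapsed_time < scheduled_time:      # ahead of schedule
        return max(min_threshold, threshold - 1)
    return threshold


def select_units(units, threshold=0):
    """Return the indices of the units that get processed.

    units: list of (priority, scheduled_time, cost) tuples.
    Assumption: a unit is processed only if priority >= threshold.
    """
    elapsed = 0
    processed = []
    for index, (priority, scheduled_time, cost) in enumerate(units):
        threshold = update_threshold(threshold, scheduled_time, elapsed)
        if priority >= threshold:
            processed.append(index)
            elapsed += cost                # simulated processing time
    return processed


# Each unit costs more time than the schedule allows, so the threshold
# rises until the low-priority last unit is skipped.
chosen = select_units([(3, 0, 50), (1, 40, 50), (3, 80, 50), (0, 120, 10)])
```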

The present invention according to claim 21 is the 
audio-video receiving apparatus according to claim 20, 
wherein retransmission-request-priority deciding means 
for deciding a threshold serving as a criterion for 
selecting whether to request retransmission of some of said 
information not received because it is lost under 
transmission when it is necessary to retransmit said 
information is included, and 

said decided threshold is decided in accordance with 
at least one of the priority controlled by said priority 
deciding means, retransmission frequency, lost factor of 
information, insertion interval between in-frame-encoded
frames, and grading of priority. 

The present invention according to claim 22 is an 
audio-video transmitting apparatus comprising: 
retransmission-priority deciding means for deciding a 
threshold serving as a criterion for selecting whether to 
request retransmission of some of said information not 
received because it is lost under transmission when 
retransmission of said unreceived information is requested 
is included, wherein said decided threshold is decided in 
accordance with at least one of the priority controlled by 
the priority deciding means of said audio-video receiving 
apparatus of claim 20, retransmission frequency, lost 
factor of information, insertion interval between
in-frame-encoded frames, and grading of priority.

The present invention according to claim 23 is an 
audio-video transmitting apparatus for transmitting said 
encoded information by using the priority added to said 
encoded information and thereby thinning it when (1) an
actual transfer rate exceeds the target transfer rate of
information for a video or audio or (2) it is decided that
writing of said encoded information into a transmitting 
buffer is delayed as the result of comparing the elapsed 
time after start of transmission with a period to be decoded 
or output added to said encoded information. 
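
Condition (1) can be illustrated with a short sketch of priority-based thinning at the transmitting side. The function name `thin_by_priority`, the byte budget per interval, and the convention that a larger priority value marks more important information are assumptions for illustration only.

```python
def thin_by_priority(packets, target_bytes):
    """Drop the least important packets until the total fits the target.

    packets: list of (priority, size_bytes) tuples in sending order.
    target_bytes: number of bytes allowed by the target transfer rate.
    Assumption: a larger priority value means more important data.
    """
    total = sum(size for _, size in packets)
    # Visit packets from least to most important.
    order = sorted(range(len(packets)), key=lambda i: packets[i][0])
    dropped = set()
    for index in order:
        if total <= target_bytes:
            break
        dropped.add(index)
        total -= packets[index][1]
    # Keep the surviving packets in their original sending order.
    return [p for i, p in enumerate(packets) if i not in dropped]


queue = [(3, 400), (1, 300), (2, 300), (0, 200)]
sent = thin_by_priority(queue, 800)
```

In this example the queue holds 1200 bytes against a target of 800, so the packets with priority 0 and 1 are thinned and the remaining 700 bytes are sent.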

The present invention according to claim 25 is a data 
processing apparatus comprising: receiving means for 
receiving a data series including (1) time-series data for
audio or video, (2) an inter-time-series-data priority
showing the priority of the processing between said
time-series-data values, and (3) a plurality of
in-time-series-data priorities for dividing said time-series
data value to show the processing priority between divided 
data values; and data processing means for performing 
processing by using said inter-time-series-data priority 
and said in-time-series-data priority together when 
pluralities of said time-series-data values are 
simultaneously present. 

The present invention according to claim 27 is a data 
processing apparatus comprising: receiving means for 
receiving a data series including (1) time-series data for 
audio or video, (2) an inter-time-series-data priority 
showing the priority of the processing between said 
time-series-data values, and (3) a plurality of
in-time-series-data priorities for dividing said time-series
data value to show the processing priority between divided 
data values; and data processing means for distributing 
throughput to each of said time-series-data values in 
accordance with said inter-time-series-data priority and 
moreover, adaptively deteriorating the processing quality 
of the divided data in said time-series data in accordance 
with said in-time-series-data priority so that each of said 
time-series-data values is kept within said distributed 
throughput.
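
The two levels of priority described above can be sketched as a two-step procedure. The proportional split of throughput, the unit "cost", and the rule that a larger value means higher priority are assumptions chosen for illustration; the specification does not fix them at this point, and the function names are hypothetical.

```python
def distribute_throughput(total_budget, stream_priorities):
    """Split the available throughput between time-series data streams.

    stream_priorities: dict mapping stream name to its
    inter-time-series-data priority (assumed: larger = more important).
    Assumption: the budget is shared in proportion to the priorities.
    """
    weight_sum = sum(stream_priorities.values())
    return {name: total_budget * priority / weight_sum
            for name, priority in stream_priorities.items()}


def fit_stream(units, budget):
    """Keep the most important divided data that fits within the budget.

    units: list of (in_stream_priority, cost) tuples in stream order.
    Lower-priority units are skipped, which degrades quality adaptively.
    """
    order = sorted(range(len(units)), key=lambda i: -units[i][0])
    kept, used = set(), 0
    for index in order:
        cost = units[index][1]
        if used + cost <= budget:
            kept.add(index)
            used += cost
    return [units[i] for i in sorted(kept)]


budgets = distribute_throughput(100, {"video": 3, "audio": 1})
video_units = fit_stream([(2, 40), (0, 30), (1, 30)], budgets["video"])
```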

The present invention according to claim 29 is a data 
processing apparatus characterized by, when an
in-time-series-data priority for a video is added every frame of
said video and said video for each frame is divided into
a plurality of packets, adding said in-time-series-data
priority only to the header portion of a packet for 
transmitting the head portion of a frame of said video 
accessible as independent information. 

The present invention according to claim 31 is the data 
processing apparatus according to any one of claims 25, 27,
and 29, wherein said in-time-series-data priority is 
described in the header of a packet to perform priority 
processing. 

The present invention according to claim 33 is the data 
processing apparatus according to any one of claims 25, 27, 
and 29, wherein the range of a value capable of expressing 
said in-time-series-data priority is made variable to 
perform priority processing. 

The present invention according to claim 34 is a data 
processing method comprising the steps of: inputting a data 
series including time-series data for audio or video and 
an inter-time-series-data priority showing the processing 
priority between said time-series data values; and 

processing priorities by using said
inter-time-series-data priority as the value of a relative or absolute
priority. 

The present invention according to claim 36 is a data 
processing method comprising the steps of: classifying 
time-series data values for audio or video; inputting a data 
series including said time-series data and a plurality of 
in-time-series-data priorities showing the processing
priority between said classified data values; and 
processing priorities by using said in-time-series-data 
priority as the value of a relative or absolute priority. 

Moreover, to solve the problem (C1), the present
invention is characterized by: 

inputting, for example, a video as waveform data in 
accordance with the waveform-data-transmitting method of 
claim 63; or 

outputting, for example, a video as waveform data in 
accordance with the waveform-data-receiving method of 
claim 69. 

Moreover, to solve the problem (C2), the present
invention is characterized by: 

(d) outputting the execution time of each group 
obtained through estimation in accordance with the
waveform-data-receiving method of claim 69; or 

(d) inputting a data string constituted with the 
execution time of each group; and 

(e) computing the execution frequency of each group 
for completing decoding within a time required to transmit 
a code length determined by the designation of a rate 
controller or the like in accordance with each execution 
time of the receiving means in accordance with the 
waveform-data-transmitting method of claim 63.

Furthermore, to solve the problem (C3), the present 
invention is characterized by: 



(d) estimating the execution time of each group in 
accordance with the processing time required to encode a 
video and each execution frequency output by counting 
means; and

(e) estimating the processing time required to encode 
a video by using the above execution time and computing the 
execution frequency of each group in which the processing 
time does not exceed the time available to process one
picture, determined by a frame rate given as the designation
of a user in accordance with the waveform-data-transmitting 
method of claim 67. 

The present invention has the above structure to obtain 
the execution frequency of indispensable processing and 
that of dispensable processing, transmit the execution 
frequencies to the receiving side, and estimate the time 
required for each processing in accordance with the 
execution frequencies and the decoding time. 

By reducing each execution frequency of dispensable 
processing so that the time required for decoding becomes 
shorter than a designated time in accordance with the 
estimated time of each processing, it is possible to control 
the decoding time to the designated time or shorter and keep 
a delay small. 
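
The reduction of the execution frequency of dispensable processing can be sketched with a simple linear cost model. The model (decoding time equals the sum of execution frequency times per-execution time for each group), the use of exactly two groups, the microsecond units, and the function name `limit_dispensable` are assumptions for illustration.

```python
def limit_dispensable(n_indispensable, n_dispensable,
                      t_indispensable, t_dispensable, designated_time):
    """Return how many dispensable executions fit in the designated time.

    n_*: execution frequencies received from the transmitting side.
    t_*: estimated time per execution of each group (same unit as
         designated_time, e.g. microseconds).
    Assumed model: total = n_ind * t_ind + n_disp * t_disp.
    """
    # Indispensable processing is always executed in full.
    remaining = designated_time - n_indispensable * t_indispensable
    if remaining <= 0:
        return 0
    allowed = int(remaining // t_dispensable)
    return min(n_dispensable, allowed)


# 100 indispensable executions at 50 us leave 5000 us, which is enough
# for 250 of the 300 requested dispensable executions at 20 us each.
allowed_count = limit_dispensable(100, 300, 50, 20, 10000)
```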

Claims 67 and 73 are mainly listed as the inventions 
for solving the problem (C1).

Moreover, it is possible to set the decoding execution 
time to a value equal to or less than a designated time by 
transmitting the execution time of indispensable 
processing and that of dispensable processing estimated by
the receiving side to the transmitting side and determining 
each execution frequency at the transmitting side in 
accordance with each execution time. 

Claims 75 and 77 are mainly listed as the inventions 
for solving the problem (C2). 

Moreover, it is possible to set the encoding estimation
time to a value equal to or less than a user designated time 
by estimating the execution time of indispensable 
processing and that of dispensable processing and 
determining each execution frequency in accordance with 
each execution time and the user designated time determined 
by a frame rate designated by a user. 

Claim 79 is mainly listed as the invention for solving 
the problem (C3). 

Brief Description of the Drawings 

Figure 1 is a schematic block diagram of the 
audio-video transceiver of an embodiment of the present 
invention; 

Figure 2 is an illustration showing a reception control 
section and a separating section; 

Figure 3 is an illustration showing a method for 
transmitting and controlling video and audio by using a 
plurality of logical transmission lines; 

Figure 4 is an illustration showing a method for 
dynamically changing header information added to the data 
for a video or audio to be transmitted;

Figures 5(a) and 5(b) are illustrations showing a
method for adding AL information; 

Figures 6(a) to 6(d) are illustrations showing 
examples of a method for adding AL information; 

Figure 7 is an illustration showing a method for 
transmitting information by dynamically multiplexing and 
separating a plurality of logical transmission lines; 

Figure 8 is an illustration showing a procedure for 
transmitting a broadcasting program; 

Figure 9(a) is an illustration showing a method for 
transmitting a video or audio considering the read and rise 
time of program or data when the program or data is present 
at a receiving terminal; 

Figure 9(b) is an illustration showing a method for 
transmitting a video or audio considering the read and rise 
time of program or data when the program or data is 
transmitted; 

Figure 10(a) is an illustration showing a method for 
corresponding to zapping; 

Figure 10(b) is an illustration showing a method for 
corresponding to zapping; 

Figure 11(a) is an illustration showing a specific
example of the protocol to be actually transferred between
terminals;

Figure 11(b) is an illustration showing a specific 
example of the protocol to be actually transferred between 
terminals;

Figure 12 is an illustration showing a specific example
of the protocol to be actually transferred between
terminals;

Figure 13(a) is an illustration showing a specific 
example of the protocol to be actually transferred between 
terminals;

Figure 13(b) is an illustration showing a specific
example of the protocol to be actually transferred between
terminals;

Figure 13(c) is an illustration showing a specific
example of the protocol to be actually transferred between
terminals;

Figure 14 is an illustration showing a specific example
of the protocol to be actually transferred between
terminals;

Figure 15 is an illustration showing a specific example 
of the protocol to be actually transferred between 
terminals; 

Figure 16(a) is an illustration showing a specific 
example of the protocol to be actually transferred between 
terminals;

Figure 16(b) is an illustration showing a specific 
example of the protocol to be actually transferred between 
terminals; 

Figure 17 is an illustration showing a specific example 
of the protocol to be actually transferred between 
terminals;

Figure 18 is an illustration showing a specific example
of the protocol to be actually transferred between
terminals;

Figure 19(a) is an illustration showing a specific 
example of the protocol to be actually transferred between 
terminals; 

Figure 19(b) is an illustration showing a specific 
example of the protocol to be actually transferred between 
terminals;

Figures 20(a) to 20(c) are block diagrams of 
demonstration systems of CGD of the present invention; 

Figure 21 is an illustration showing a method for 
adding a priority under overload at an encoder; 

Figure 22 is an illustration describing a method for 
deciding a priority at a receiving terminal under overload; 

Figure 23 is an illustration showing temporal change 
of priorities; 

Figure 24 is an illustration showing stream priority 
and object priority; 

Figure 25 is a schematic block diagram of a video 
encoder and a video decoder of an embodiment of the present 
invention; 

Figure 26 is a schematic block diagram of an audio 
encoder and an audio decoder of an embodiment of the present 
invention; 

Figures 27(a) and 27(b) are illustrations showing a 
priority adding section and a priority deciding section for 
controlling the priority of processing under overload;

Figures 28(a) to 28(c) are illustrations showing the
grading for adding a priority; 

Figure 29 is an illustration showing a method for 
assigning a priority to multi-resolution video data; 

Figure 30 is an illustration showing a method for
constituting a communication payload;

Figure 31 is an illustration showing a method for
making data correspond to a communication payload;

Figure 32 is an illustration showing the relation 
between object priority, stream priority, and 
communication packet priority; 

Figure 33 is a block diagram of a transmitter of the 
first embodiment of the present invention; 

Figure 34 is an illustration of the first embodiment;

Figure 35 is a block diagram of the receiver of the
third embodiment of the present invention; 

Figure 36 is a block diagram of the receiver of the 
fifth embodiment of the present invention; 

Figure 37 is an illustration of the fifth embodiment;

Figure 38 is a block diagram of the transmitter of the
sixth embodiment of the present invention; 

Figure 39 is a block diagram of the transmitter of the 
eighth embodiment of the present invention; 

Figure 40 is a flowchart of the transmission method 
of the second embodiment of the present invention; 

Figure 41 is a flowchart of the reception method of 
the fourth embodiment of the present invention;

Figure 42 is a flowchart of the transmission method
of the seventh embodiment of the present invention; 

Figure 43 is a flowchart of the transmission method 
of the ninth embodiment of the present invention; 

Figure 44 is a block diagram showing an audio-video 
transmitter of the present invention; 

Figure 45 is a block diagram showing an audio-video 
receiver of the present invention; 

Figure 46 is an illustration for explaining priority 
adding means for adding a priority to a video and audio of 
an audio-video transmitter of the present invention; and

Figure 47 is an illustration for explaining priority
deciding means for deciding whether to perform decoding by 
interpreting the priority added to a video and audio of an 
audio-video receiver of the present invention. 



(Description of Symbols) 

11 Reception control section 

12 Separating section 

13 Transmitting section 

14 Video extending section (Picture extending section) 

15 Video-extension control section (Picture-extension
control section)

16 Video synthesizing section (Picture synthesizing 
section) 

17 Output section 

18 Terminal control section 

4011 Transmission control section

4012 Video encoding section (Picture encoding section)

4013 Reception control section 

4014 Video decoding section (Picture decoding section) 

4015 Video synthesizing section (Picture synthesizing 
section) 

4016 Output section 

4101 Video encoder (Picture encoder) 

4102 Video decoder (Picture decoder) 

301 Receiving means 

302 Estimating means 

303 Video decoder (i.e. Dynamic-picture or Moving-picture
decoder)

304 Frequency reducing means 

306 Output terminal 

307 Input terminal 

3031 Variable decoding means 

3032 Inverse orthogonal transforming means 

3033 Switching unit 

3034 Movement compensating means 

3035 Execution-time measuring means

Best Mode for Carrying Out the Invention

Embodiments of the present invention are described 
below by referring to the accompanying drawings. 

The embodiments described below mainly solve any one
of the above problems (A1) to (A6).

A "picture (or video)" as used in the present invention
includes both a static picture and a moving picture. Moreover,
a target picture can be a two-dimensional picture such as
computer graphics (CG) or three-dimensional picture data
constituted by a wire-frame model.

Figure 1 is a schematic block diagram of the 
audio-video transceiver of an embodiment of the present 
invention. 

In Figure 1, a reception control section 11 for
receiving information and a transmitting section 13 for
transmitting information are information transmitting
means such as a coaxial cable, CATV, LAN, or modem.
The communication environment can be one in which
a plurality of logical transmission lines can be used
without considering multiplexing means, such as the Internet,
or one in which multiplexing means must be considered,
such as analog telephone or satellite broadcast.

Moreover, a system for bidirectionally transferring 
video and audio between terminals such as a picture 
telephone or teleconference system or a system for 
broadcasting broadcast-type video and audio through 
satellite broadcast, CATV, or internet are listed as 
terminal connection systems. The present invention takes 
such terminal connection systems into consideration. 

A separating section 12 shown in Figure 1 is means for
analyzing received information and separating data from
control information. Specifically, the section 12 is means
for separating the header information for transmission
added to data from the data, or separating the header for
data control added to the data from the contents of the
data. A picture extending section 14 is means for extending
a received video. For example, a video to be extended may
or may not be a compressed picture of a standardized moving
(dynamic) or static picture format such as H.261, H.263,
MPEG1/2, or JPEG.

The picture-extension control section 15 shown in
Figure 1 is means for monitoring the extended state of a
video. For example, by monitoring the extended state of
a picture, it is possible to empty-read a receiving buffer
without extending the picture when the receiving buffer
is about to overflow and to restart the extension of the
picture once the picture is ready for extension.

Moreover, in Figure 1, a picture synthesizing section
16 is means for synthesizing an extended picture. A picture
synthesizing method can be defined by describing a picture
and its structural information (display position and
display time; a display period can also be included),
a method for grouping pictures, a picture display layer
(depth), an object ID (SSRC to be described later), and the
relation between their attributes with a script language
such as Java, VRML, or MHEG. The script describing the
synthesizing method is input or output through a network
or a local memory.

Moreover, an output section 17 is a display or printer
for outputting a picture synthesized result. A terminal
control section 18 is means for controlling each section.
Furthermore, it is possible to use a structure for extending
an audio instead of a picture (it is possible to constitute
the structure by changing a picture extending section to
an audio extending section, a picture extension control
section to an audio extension control section, and a picture
synthesizing section to an audio synthesizing section) or
a structure for extending a picture and an audio and
synthesizing and displaying them while keeping temporal
synchronization.

Furthermore, it is possible to transmit a picture and 
an audio by using a picture compressing section for 
compressing a picture, a picture compression control
section for controlling the picture compressing section, 
an audio compressing section for compressing an audio, and 
an audio compression control section for controlling the 
audio compressing section. 

Figure 2 is an illustration showing a reception control 
section and a separating section. 

By constituting the reception control section 11 shown 
in Figure 1 with a data receiving section 101 for receiving 
data and a control information receiving section 102 for 
receiving the control information for controlling data and 
the separating section 12 with a transmission format 
storing section 103 for storing a transmission structure 
(to be described later in detail) for interpreting 
transmission contents and a transmission information
interpreting section 104 for interpreting transmission 
contents in accordance with the transmission structure 
stored in the transmission format storing section 103, it 
is possible to independently receive data and control
information. Therefore, for example, it is easy to delete
or move a received video or audio while receiving it. 

As described above, the communication environment
targeted by the reception control section 11 can be a
communication environment (Internet profile) in which a
plurality of logical transmission lines can be used without
considering multiplexing means, like the Internet, or a
communication environment (Raw profile) in which
multiplexing means must be considered, like analog telephone
or satellite broadcast. From the user's point of view,
however, a communication environment is premised in which
a plurality of logical transmission lines (logical
channels) are prepared (for example, in the case of a
communication environment in which TCP/IP can be used, the
expression "communication port" is generally used).

Moreover, as shown in Figure 2, it is assumed that the
reception control section 11 receives one or more types of
data transmission line and one or more types of control
logical transmission line for controlling the data to be
transmitted. It is also possible to prepare a plurality of
transmission lines for transmitting data and only one
transmission line for controlling data. Moreover, it is
possible to prepare a transmission line for controlling
data for each data transmission line, like the RTP/RTCP also
used for H.323. Furthermore, when considering broadcast
using UDP, it is possible to use a communication system using
a single communication port (multicast address).

Figure 3 is an illustration for explaining a method
for transmitting and controlling video and audio by using
a plurality of logical transmission lines. The data to be
transmitted is referred to as ES (Elementary Stream), which,
in the case of a picture, can be picture information for one
frame or picture information in GOBs or macroblocks smaller
than one frame.

In the case of an audio, it is possible to use a fixed 
length decided by a user. Moreover, the data-control 
header information added to the data to be transmitted is 
referred to as AL (Adaptation Layer information). The 
information showing whether it is a start position capable 
of processing data, information showing data-reproducing 
time, and information showing the priority of data 
processing are listed as the AL information. Data control 
information of the present invention corresponds to the AL 
information. Moreover, it is not always necessary for the 
ES and AL used for the present invention to coincide with 
the contents defined by MPEG1/2. 

The information showing whether it is a start position 
capable of processing data specifically includes two types 
of information. The first is a flag for random access, that
is, the information showing that the data can be individually
read and reproduced independently of preceding or following
data, such as an intra-frame (I picture) in the case of a
picture. The second is an access flag, that is, a flag
showing that the data can be individually read, for example
the information showing that it is the head of
pictures in GOBs or macroblocks in the case of a picture.
Therefore, absence of an access flag shows the middle of
data. It is not always necessary to provide both the random
access flag and the access flag as the information showing
that it is a start position capable of processing data.
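
As an illustrative sketch only, the AL information described above can be modelled as a record in which every item is optional. The class name, the field names, and the use of Python are assumptions made for this illustration and are not part of the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ALInfo:
    """Hypothetical model of the AL (data control) information."""
    random_access: Optional[bool] = None  # readable independently (e.g. I picture)
    access: Optional[bool] = None         # head of a GOB or macroblock unit
    pts: Optional[int] = None             # data-reproducing time, if used
    priority: Optional[int] = None        # data-processing priority, if used


def is_start_position(al: ALInfo) -> bool:
    """Return True if the data piece can be processed from its beginning."""
    # An absent (None) or cleared flag is treated as "middle of data".
    return bool(al.random_access) or bool(al.access)
```

In this sketch a field left as None stands for an item that the terminals decided not to use before transferring data.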

There is a case in which no problem occurs even if
neither of the flags is added, as in real time
communication such as a teleconference system. However,
to perform editing easily, a random access flag is necessary.
It is also possible to decide whether a flag is necessary,
or which flag is necessary, through a communication channel
before transferring data.

The information indicating a data reproducing time 
shows the information for time synchronization when a 
picture and an audio are reproduced, which is referred to 
as PTS (Presentation Time Stamp) in the case of MPEG1/2.
Because time synchronization is not normally considered in 
the case of the real time communication such as a 
teleconference system, the information representing a 
reproducing time is not always necessary. The time 
interval between encoded frames may be necessary 
information. 

By making the receiving side adjust a time interval, 
it is possible to prevent a large fluctuation of frame 
intervals. However, by making the receiving side adjust 
the reproducing interval, a delay may occur. Therefore, 
it may be decided that the time information showing the frame 
interval between encoded frames is unnecessary.

Whether the information showing a data reproducing time
represents a PTS or a frame interval, or whether no data
reproducing time is added to the data at all, can be decided
before the data is transmitted. The decision is communicated
to a receiving terminal through the communication channel,
and the data is then transmitted together with the decided
data control information.

The information showing the priority for processing data
is used as follows. When data cannot be processed or
transmitted due to the load of a receiving terminal or that
of a network, it is possible to reduce the load of the
receiving terminal or network by stopping the processing or
transmission of the data in accordance with the priority.

The receiving terminal is able to process the data with 
the picture-extension control section 15 and the network 
is able to process the data with a relay terminal or router. 
The priority can be expressed by a numerical value or a flag. 
Moreover, by transmitting the offset value of the 
information showing the data-processing priority as 
control information or data control information (AL 
information) together with data and adding the offset value 
to the priority previously assigned to a video or audio in 
the case of a sudden fluctuation of the load of a receiving 
terminal or network, it is possible to set a dynamic priority 
corresponding to the operation state of a system. 

Furthermore, by transmitting the information for
identifying presence/absence of scramble,
presence/absence of copyright, and original or copy as
control information together with a data identifier (SSRC),
separately from the data, it becomes simple to cancel the
scramble at a relay node.

Moreover, the information showing the data processing
priority can be added to each stream constituted by the
aggregation of frames of a plurality of pictures or audios
or to each frame of video or audio.

Priority adding means for deciding the encoded- 
information processing priority under overload in 
accordance with the predetermined rules by the encoding 
method such as H.263 or G.723 and making the encoded 
information correspond to the decided priority is provided 
for a transmitting terminal unit (see Figure 46). 

Figure 46 is an illustration for explaining priority 
adding means 5201 for adding a priority to a picture and 
an audio. 

That is, as shown in Figure 46, a priority is added 
to encoded-video data (to be processed by video encoding 
means 5202) and encoded-audio data (to be processed by audio
encoding means 5203) in accordance with predetermined rules.
The rules for adding priorities are stored in priority
adding rules 5204. The rules include rules for adding a
priority higher than that of a P-frame (inter-frame encoded
picture frame) to an I-frame (intra-frame encoded picture
frame) and rules for adding a priority lower than that of
an audio to a picture. Moreover, it is possible to change 
the rules in accordance with the designation of a user.

Objects to which a priority is added are, in the case
of a picture, scene changes and picture frames or streams
designated by an editor or user and, in the case of an
audio, audio blocks and audioless blocks.
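
The priority adding rules 5204 described above can be sketched, under stated assumptions, as a small rule table. The table layout, the function name, and the numerical values are hypothetical; the sketch also assumes the convention that a smaller value means a higher priority.

```python
# Hypothetical rule table; a smaller value means a higher priority (assumed).
DEFAULT_RULES = {
    ("audio", None): 0,   # an audio is given a higher priority than a picture
    ("video", "I"): 1,    # intra-frame encoded picture frame
    ("video", "P"): 2,    # inter-frame encoded picture frame
}


def add_priority(media, frame_type=None, rules=DEFAULT_RULES):
    """Return the priority for one encoded unit according to the rule table."""
    key = (media, frame_type if media == "video" else None)
    return rules[key]


# A user may change the rules, for example to favour I-frames over audio.
user_rules = dict(DEFAULT_RULES)
user_rules[("video", "I")] = 0
user_rules[("audio", None)] = 1
```

Keeping the rules in a table separate from the encoders is one way to let a user designate different rules without touching the encoding means.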

To add a priority in picture or audio frames for
defining the processing priority under overload, the
following methods are considered: a method for adding a
priority to a communication header and a method for
embedding a priority, at the time of encoding, in the header
of the bit stream in which a video or audio is encoded. The
former makes it possible to obtain the priority information
without decoding the bit stream, and the latter makes it
possible to handle a single bit stream independently,
without depending on a system.

When one picture frame (e.g. intra-frame encoded 
I-frame or inter-frame encoded P- or B-frame) is divided 
into a plurality of transmission packets, a priority is 
added only to a communication header for transmitting the 
head of a picture frame accessible as independent 
information in the case of a picture (when priorities are 
equal in the same picture frame, it is possible to assume 
that the priorities are not changed before the head of the 
next accessible picture frame appears). 

Moreover, it is possible to realize configuration in 
accordance with control information by making the range of 
a value capable of expressing a priority variable (for 
example, expressing time information with 16 bits or 32 bits 
depending on the purpose).

Furthermore, in the case of a decoder, priority
deciding means for deciding a processing method in
accordance with the priorities, under overload, of the
various received pieces of encoded information is provided
for a receiving terminal unit (see Figure 47).

Figure 47 is an illustration for interpreting
priorities added to a picture and an audio and explaining 
priority deciding means 5301 for deciding whether to 
perform decoding. 

That is, as shown in Figure 47, the priorities include 
a priority added to each stream of each picture or audio 
and a priority added to each frame of a picture or audio. 
It is possible to use these priorities independently or by 
making a frame priority correspond to a stream priority. 
The priority deciding means 5301 decides a stream or frame 
to be decoded in accordance with these priorities. 

Decoding is performed by using two types of priorities 
for deciding a processing priority under overload at a 
terminal.

That is, a stream priority (inter-time-series 
priority) for defining a relative priority between bit 
streams such as a picture and audio and a frame priority 
(intra-time-series priority) for defining a relative
priority between decoding units such as picture frames in 
the same stream are defined (Figure 24). 

The former stream priority makes it possible to handle
a plurality of videos or audios. The latter frame priority
makes it possible to change scenes or add different
priorities even to the same intra-frame encoded picture
frames (I-frame) in accordance with the intention of an
editor.

By making a stream priority correspond to a time 
assigned to an operating system (OS) for encoding or 
decoding a picture or audio or a processing priority and 
thereby controlling the stream priority, it is possible to 
control a processing time at an OS level. For example, in 
the case of Windows95/NT of Microsoft Corporation, a 
priority can be defined at five OS levels. By realizing 
encoding or decoding means by software in threads, it is 
possible to decide a priority at an OS level to be assigned 
to each thread in accordance with the stream priority of 
the target stream.
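
As a minimal sketch of this correspondence, a stream priority can be scaled onto five scheduling levels. The level names and the linear scaling are assumptions made for illustration; they do not refer to the actual priority constants of any operating system.

```python
# Five hypothetical OS scheduling levels, from most to least favoured.
OS_LEVELS = ["highest", "above_normal", "normal", "below_normal", "lowest"]


def os_level_for_stream(stream_priority, max_stream_priority):
    """Map a stream priority (0 = most important, assumed) to an OS level."""
    if not 0 <= stream_priority <= max_stream_priority:
        raise ValueError("stream priority out of range")
    if max_stream_priority == 0:
        return OS_LEVELS[0]
    # Scale linearly: 0 maps to the first level, the maximum to the last.
    index = round(stream_priority * (len(OS_LEVELS) - 1) / max_stream_priority)
    return OS_LEVELS[index]
```

A decoding thread for each stream would then be started at the level returned for that stream.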

The frame priority and stream priority described above 
can be applied to a transmission medium or data-recording 
medium. For example, by defining the priority of a packet 
to be transmitted as an access unit priority, it is possible 
to decide a priority concerned with packet transmission or 
a priority for processing by a terminal under overload in 
accordance with the relation between frame priority and 
stream priority such as the relation of Access Unit Priority 
= Stream Priority - Frame Priority. 

Moreover, it is possible to decide a priority by using 
a floppy disk or optical disk as a data-recording medium. 
Furthermore, it is possible to decide a priority by using 
not only a recording medium but also an object capable of 
recording a program such as an IC card or ROM cassette.
Furthermore, it is possible to use a repeater for a picture
or audio such as a router or gateway for relaying data. 

As a specific method for using a priority, when a
receiving terminal is overloaded, priority deciding means
for deciding the threshold of the priority of encoded
information to be processed is provided in a
picture-extension control section or audio-extension
control section. The time to be displayed (PTS) or the time
to be decoded (DTS) is compared with the elapsed time after
start of processing, and the threshold of the priority of
encoded information to be processed is changed in accordance
with the comparison result (it is also possible to refer to
the insertion interval of I-frames or the grading of a
priority as the information for changing the threshold).

In the case of the example shown in Figure 20(a), a
picture with the size of captured QCIF or CIF is encoded
by an encoder (H.263), which at the time of encoding outputs
a time stamp showing the time for decoding (DTS) or the time
for displaying the picture (PTS), priority information
showing processing sequence under overload (CGD,
Computational Graceful Degradation), frame type, and
sequence number (SN) together with encoded information.

Moreover, in the case of the example shown in Figure
20(b), an audio is also recorded through a microphone and
encoded by an encoder (G.721) to output a time stamp
showing the time for decoding (DTS) or the time for
reproducing the audio (PTS), priority information (CGD), and
sequence number (SN) together with encoded information.

Under decoding, as shown in Figure 20(c), a picture 
and an audio are supplied to separate buffers to compare 
their respective DTS (decoding time) with the elapsed time 
after start of processing. When DTS is not delayed, the 
picture and the audio are supplied to their corresponding 
decoders (H.263 and G.721). 

The example in Figure 21 describes a method for adding 
a priority by an encoder under overload. For a picture, 
high priorities of "0" and "1" are assigned to I-frame
(intra-frame encoded picture frame) (the smaller the
numerical value, the higher the priority). P-frame
has a priority of "2" which is lower than that of I-frame.
Because two levels of priorities are assigned to I-frame, 
it is possible to reproduce only I-frame having a priority 
of "0" when a terminal for decoding has a large load. 
Moreover, it is necessary to adjust the insertion interval 
of I-frame in accordance with a priority adding method. 

The example in Figure 22 shows a method for deciding
a priority at a receiving terminal under overload. The
priority of a frame to be discarded is set to a value larger
than a cutoff priority. That is, every picture frame is
assumed to be an object to be processed. It is possible to
know the maximum value of priorities added to picture frames
in advance by communicating it from the transmitting side
to the receiving side (step 101).

When DTS is compared with the elapsed time after start
of processing and, as a result, the elapsed time is larger
than DTS (decoding is not in time), the threshold of
the priority of a picture or audio to be processed is
decreased to thin out processing (step 102). However,
when the elapsed time after start of processing is smaller
than DTS (decoding is in time), the threshold of the priority
is increased in order to increase the number of pictures
or audio which can be processed (step 103).

If the immediately preceding picture was skipped and the
current frame is a P-frame, no processing is performed. If
not, a priority offset value is added to the priority of the
picture frame (or audio frame) and the result is compared
with the threshold of the priority. When the result does not
exceed the threshold, the data to be decoded is supplied to
a decoder (step 104).
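
Steps 101 to 104 above can be sketched as follows. This is only an illustrative reading of the procedure: the class and method names are hypothetical, the threshold is assumed to change by one level per frame, and a smaller priority value is assumed to mean a higher priority.

```python
class PriorityDecider:
    """Sketch of the receiver-side decision of Figure 22 (names hypothetical)."""

    def __init__(self, max_priority, offset=0):
        self.max_priority = max_priority   # step 101: told by the transmitter
        self.threshold = max_priority      # initially every frame is processed
        self.offset = offset               # priority offset for this stream
        self.prev_skipped = False

    def should_decode(self, priority, frame_type, dts, elapsed):
        if elapsed > dts:                  # step 102: decoding is not in time
            self.threshold = max(0, self.threshold - 1)
        elif elapsed < dts:                # step 103: decoding is in time
            self.threshold = min(self.max_priority, self.threshold + 1)
        if frame_type == "P" and self.prev_skipped:
            return False                   # the preceding picture was skipped
        decode = priority + self.offset <= self.threshold   # step 104
        self.prev_skipped = not decode
        return decode
```

For example, a decider created with max_priority=2 keeps decoding I-frames of priority 0 while late P-frames of priority 2 are thinned out, and resumes P-frames only after the next decoded I-frame.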

A priority offset can be used in two ways: the
performance of a machine is checked in advance and the
offset is communicated to a receiving terminal (it is also
possible that a user issues designation at the receiving
terminal), or the priorities of a plurality of video
and audio streams are changed stream by stream (for example,
thinning out processing by increasing the offset value of
the rearmost background).

When a multi-stream is targeted, it is also possible
to add a priority to each stream and decide the skip of
decoding of a picture or audio. Moreover, in the case of
real time communication, it is possible to decide whether
decoding is advanced or delayed at the terminal by handling
the TR (Temporal Reference) of H.263 similarly to DTS and
realize the same skipping as described above.

Figure 23 is an illustration showing temporal change 
of priorities by using the above algorithm. 

Figure 23 shows the change of a priority to be added 
to a picture frame. This priority is a priority for 
deciding whether to perform decoding when a terminal is 
overloaded, which is added to every frame. The smaller the
value of a priority, the higher the priority.
In the case of the example in Figure 23, 0 has the highest
priority. When the threshold of a priority is 3, a frame
to which a priority value larger than 3 is added
is discarded without being decoded and a frame
to which a priority value of 3 or less is added is decoded.
By selectively discarding frames in accordance with
priorities, it is possible to control the load of a terminal.
It is also possible to dynamically decide the priority
threshold in accordance with the relation between the
present processing time and the decoding time (DTS)
added to each frame. This technique can be applied not only
to a picture frame but also to an audio in accordance with
the same procedure.

In the case of a transmission line such as the Internet,
when it is necessary to retransmit encoded information lost
during transmission, it is possible to retransmit only the
picture or audio required by the receiving side. For this
purpose, a retransmission request priority deciding section,
which decides the threshold of the priority of the encoded
information to be retransmitted, is provided for the
reception control section. The threshold of the priority
added to the encoded information whose retransmission should
be requested is decided in accordance with the priority
information, retransmission frequency, loss rate of
information, insertion interval of intra-frame encoded
frames, and grading of priority (e.g. five-level priority),
which are controlled
by the priority deciding section. If the retransmission
frequency or loss rate of information is too large, it is 
necessary to raise the priority of the information to be 
retransmitted and lower the retransmission or loss rate. 
Moreover, by knowing the priority used for the priority 
deciding section, it is possible to prevent the information 
to be processed from being transmitted. 

In the case of a transmitting terminal, when an actual 
transfer rate exceeds the target transfer rate of the 
information of the transmitting terminal or when writing 
of the encoded information into a transmitting buffer is 
delayed as the result of comparing the elapsed time after 
start of transfer processing with the time added to the 
encoded information to be decoded or displayed, it is 
possible to transmit a picture or audio matching with the 
target rate by using a priority added to encoded information 
and used by the priority deciding section of the receiving 
terminal when the terminal is overloaded and thereby 
thinning out transmissions of information. Moreover, by
introducing the processing skipping function under
overload performed at the receiving-side terminal into the
transmitting-side terminal, it is possible to suppress a
failure due to overload of the transmitting-side terminal.
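
A minimal sketch of thinning out transmissions at the transmitting terminal is shown below. The function name, the tuple layout, and the greedy selection are assumptions for illustration; the sketch assumes that a smaller priority value means a more important frame and does not model the dependency of a P-frame on the preceding frame.

```python
def thin_out(frames, byte_budget):
    """Keep the most important frames that fit within byte_budget.

    frames: list of (sequence_number, priority, size_in_bytes) tuples.
    Returns the sequence numbers to transmit, in sending order.
    """
    kept, used = [], 0
    # Visit frames from most to least important; ties keep sending order.
    for seq, priority, size in sorted(frames, key=lambda f: (f[1], f[0])):
        if used + size <= byte_budget:
            kept.append(seq)
            used += size
    return sorted(kept)
```

Here byte_budget stands for the amount of data that the target transfer rate allows in the current interval.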

By making it possible to transmit only necessary 
information out of the above-described AL information 
as required, it is possible to adjust the amount
of information to be transmitted to a narrow-band 
communication channel such as an analog telephone line. It 
is possible to recombine the AL information (data control 
information) used for the transmitting side by deciding the 
data control information to be added to data at a 
transmitting-side terminal before transmitting the data, 
communicating the data control information to be used to 
a receiving terminal as control information (for example, 
using only a random access flag) , and rewriting at the 
receiving-side terminal based on the obtained control 
information the information about a transmission structure 
(showing which AL information is used) stored in the 
transmission format storing section 103 (see Figure 16). 
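
The recombination of AL information can be sketched as follows. The class plays the role of the transmission format storing section 103; the catalogue of AL items, their field widths, and all names are hypothetical and serve only to show how a receiver could rewrite its stored layout when control information announces which items are used.

```python
import struct

# Hypothetical catalogue of AL items: name -> struct format character.
AL_FIELDS = {"random_access": "B", "access": "B", "pts": "I", "priority": "B"}


class TransmissionFormat:
    """Sketch of the transmission format storing section 103."""

    def __init__(self, used_fields):
        self.rewrite(used_fields)

    def rewrite(self, used_fields):
        # Called when control information announces a new set of AL items.
        self.used = [f for f in AL_FIELDS if f in used_fields]
        # "!" selects network byte order without padding.
        self.layout = struct.Struct("!" + "".join(AL_FIELDS[f] for f in self.used))

    def pack(self, **values):
        return self.layout.pack(*(values[f] for f in self.used))

    def unpack(self, header):
        return dict(zip(self.used, self.layout.unpack(header)))
```

With only a random access flag in use the header occupies one byte in this sketch, which suits a narrow-band channel; adding a 32-bit time item enlarges it to five bytes.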

Figure 4 is an illustration for explaining a method 
for dynamically changing header information added to the 
data for a picture or audio to be transmitted. In the case 
of the example in Figure 4, the data (ES) to be transmitted 
is decomposed into data pieces and the identifying 
information (sequence number) for showing the sequence of 
data, the information (marker bit) showing whether it is 
a start position capable of processing data pieces, and time 
information (time stamp) concerned with transfer of data
pieces are added to data pieces in the form of communication
headers by assuming that the above pieces of information 
correspond to transmission control information of the 
present invention. 

Specifically, RTP (Real-time Transport Protocol,
RFC1889) uses the information for the above sequence number,
marker bit, time stamp, object ID (referred to as SSRC),
and version number as communication headers. Though a
header-information item can be extended, the above items
are always added as fixed items. However, when the realtime
communication such as the case of a video telephone and 
transmission of accumulated media such as the case of 
video-on-demand are present together in an environment in 
which a plurality of different encoded pictures or audio 
are simultaneously transmitted, identifying means is 
necessary because meanings of communication headers are 
different from each other. 
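
For reference, the fixed items named above can be read from the 12-byte fixed RTP header as in the following sketch. The function name and the returned dictionary are choices made for this illustration; the bit positions follow the fixed header layout of RFC 1889 (version in the top two bits of the first byte, marker bit in the top bit of the second byte).

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed items of the 12-byte RTP header."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "marker": (b1 >> 7) & 1,
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

The header itself does not say whether the time stamp is a reproducing time or a frame interval, which is why identifying means is needed when both kinds of stream are present.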

For example, time-stamp information shows PTS that is 
a reproducing time as previously described in the case of 
MPEG1/2. In the case of H.261 or H.263, however, the 
time-stamp information shows a time interval when the 
information is encoded. However, to process H.263 
synchronously with an audio, it is necessary to show that 
a time stamp is PTS information. This is because time- 
stamp information shows the time interval between encoded 
frames in the case of H.263 and it is defined by RTP that
the time stamp of the first frame is random.

Therefore, it is necessary to add a flag showing
whether a time stamp is PTS as (a) communication header
information (it is necessary to extend a communication
header) or (b) header information for the payload of H.263 or
H.261 (that is, AL information) (in this case, it is
necessary to extend payload information).

A marker bit serving as the information showing whether 
it is a start position capable of processing data pieces 
is added as RTP header information. Moreover, as described 
above, there is a case in which it is necessary to provide, 
as AL information, an access flag showing that it is a start 
position capable of accessing data and a random access flag 
showing that it is possible to access data at random. 
Because providing the same flags doubly, in AL and in the 
communication header, lowers the efficiency, a method of 
substituting a flag prepared for the communication header 
for the AL flag is also considered. 

(c) The problem is solved by newly providing, in the 
communication header, a flag showing that the AL flag is 
substituted by the communication header, so that no flag 
is added to AL, or by defining that the marker bit of the 
communication header has the same meaning as the flag of AL 
(it is expected that interpretation can be performed more 
quickly than in the case of providing a flag for AL). That 
is, a flag is used which shows whether the marker bit has 
the same meaning as the flag of AL. In this case, it is 
considered to modify the communication header or to describe 
the flag in an extension region. 

However, (d) it is also possible to interpret the marker 
bit of the communication header as meaning that at least 
either of a random access flag and an access flag is present 
in AL. In this case, the version number of the 
communication header makes it possible to know that the 
interpretation has changed from the conventional one. 
Moreover, processing is simplified by providing an access 
flag or random access flag only for the communication header 
or only for the header of AL (for the former, a case of 
providing the flag for both headers is also considered, but 
it is necessary to newly extend the communication header). 
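The two interpretations (c) and (d) can be contrasted with 
a small Python sketch. The names AlInfo, is_access_point, 
and marker_means_al_flag are hypothetical, and the rule that 
a cleared marker bit under (d) means "no access flag in AL" 
is an assumption made for illustration. 

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlInfo:
    """Hypothetical AL flags; None means the flag is not carried in AL."""
    access_flag: Optional[bool] = None
    random_access_flag: Optional[bool] = None


def is_access_point(marker_bit: bool,
                    marker_means_al_flag: bool,
                    al: AlInfo) -> bool:
    """Decide whether a packet is a start position for access."""
    if marker_means_al_flag:
        # Interpretation (c): the marker bit substitutes for the AL flag,
        # so AL does not have to be inspected at all.
        return marker_bit
    # Interpretation (d): the marker bit only announces that at least one
    # of the two flags is present in AL (assumed: cleared means absent).
    if not marker_bit:
        return False
    return bool(al.access_flag) or bool(al.random_access_flag)
```

Under (c) the decision is made from the communication 
header alone, which is why the text expects quicker 
interpretation than when a flag is provided for AL. 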

As already described, the information showing the 
priority of data processing can be added as information for 
AL. By also adding the data-processing priority to the 
communication header, it becomes possible, even on a 
network, to decide how data is to be processed according to 
its priority without interpreting the contents of the data. 
Moreover, in the case of IPv6, it is possible to add the 
priority at a layer lower than the level of RTP. 

By adding a timer or counter showing the effective 
period of data processing to the communication header of 
RTP, it is possible to decide how the state of a transmitted 
packet changes. For example, when necessary decoder 
software is stored in a memory having a low access speed, 
the timer or counter makes it possible to decide what 
information is required by a decoder and when it is 
required. In this case, the timer or counter information 
or the data-processing priority information is unnecessary 
as AL information depending on the purpose. 

Figures 5(a) and 5(b) and Figures 6(a) to 6(d) are 
illustrations for explaining a method for adding AL 
information. 

By sending, to a receiving terminal, control information 
communicating whether AL is added only to the head of the 
data to be transmitted, as shown in Figure 5(a), or to each 
data piece after the data to be transmitted (ES) is 
decomposed into one or more data pieces, as shown in 
Figure 5(b), it is possible to select the granularity at 
which transmission information is handled. Adding AL to 
subdivided data is effective when access delay is a 
problem. 

As described above, to communicate in advance to a 
receiving-side terminal that data control information is to 
be recombined at the receiving side or that the method of 
arranging data control information in data is to be changed, 
the notification is expressed as a flag, counter, or timer 
and prepared as AL information or as a communication header 
to be communicated to the receiving terminal. The 
receiving terminal can thereby respond smoothly. 

In the above examples, a method for avoiding duplication 
between the header of RTP (or communication header) and AL 
information and a method for extending the communication 
header of RTP or AL information are described. 
However, it is not always necessary for the present 
invention to use RTP. For example, it is possible to newly 
define an original communication header or AL information 
by using UDP or TCP. Though the Internet profile sometimes 
uses RTP, a multifunctional header such as RTP is not 
defined in the Raw profile. The following four types of 
concepts are considered for AL information and the 
communication header (see Figures 6(a) to 6(d)). 

(1) The header information of RTP or the AL information 
is modified and extended so that the header information 
already assigned to RTP and that already assigned to AL do 
not overlap (in particular, the time-stamp information 
overlaps, and the timer, counter, or data-processing 
priority information becomes extension information). 
Alternatively, it is possible to use a method of not 
extending the header of RTP or of not considering 
duplication between AL information and RTP information. 
These correspond to the contents shown so far. Because a 
part of RTP is already in practical use for H.323, it is 
effective to extend RTP while keeping compatibility. (See 
Figure 6(a).) 

(2) Independently of RTP, the communication header is 
simplified (for example, to only a sequence number) and the 
remainder is provided in AL information as multifunctional 
control information. Moreover, by making it possible to 
variably set the items used for AL information before 
communication, it is possible to specify a flexible 
transmission format. (See Figure 6(b).) 

(3) Independently of RTP, AL information is simplified 
(as an extreme example, no information is added to AL) and 
all control information is provided in the communication 
header. A sequence number, time stamp, marker bit, payload 
type, and object ID, which are frequently used as 
communication headers, are kept as fixed information. 
Data-processing priority information and timer information 
are treated as extended information, each provided with an 
identifier showing whether the extended information is 
present, so that the extended information is referred to 
only if it is defined. (See Figure 6(c).) 

(4) Independently of RTP, the communication header and 
AL information are simplified, and a format is defined and 
transmitted as a packet separate from the communication 
header and AL information. For example, a method is also 
considered in which only a marker bit, time stamp, and 
object ID are defined for AL information, only a sequence 
number is defined for the communication header, and payload 
information, data-processing priority information, and 
timer information are defined and transmitted as a 
transmission packet (second packet) separate from the above 
information. (See Figure 6(d).) 
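Concepts (2) to (4) differ only in where each item is 
carried, which can be sketched in Python as a placement 
table. The table FIELDS_BY_CONCEPT, the function 
split_fields, and the field names are hypothetical; in 
particular, which items form the "remainder" placed in AL 
under concept (2) is an assumption. 

```python
from typing import Any, Dict, Tuple

# Where each item is carried under concepts (2), (3), and (4).
FIELDS_BY_CONCEPT: Dict[int, Dict[str, Tuple[str, ...]]] = {
    2: {
        "communication_header": ("sequence_number",),
        # Assumed remainder: everything else goes to AL.
        "al": ("marker_bit", "time_stamp", "object_id",
               "payload_type", "priority", "timer"),
        "second_packet": (),
    },
    3: {
        # Fixed items first; priority and timer are extended information.
        "communication_header": ("sequence_number", "time_stamp",
                                 "marker_bit", "payload_type",
                                 "object_id", "priority", "timer"),
        "al": (),
        "second_packet": (),
    },
    4: {
        "communication_header": ("sequence_number",),
        "al": ("marker_bit", "time_stamp", "object_id"),
        "second_packet": ("payload_type", "priority", "timer"),
    },
}


def split_fields(concept: int,
                 values: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Distribute the given items over header, AL, and second packet."""
    layout = FIELDS_BY_CONCEPT[concept]
    return {
        part: {name: values[name] for name in names if name in values}
        for part, names in layout.items()
    }
```

Items absent from the input are simply omitted, which 
mirrors the idea that extended information is referred to 
only if it is defined. 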

As described above, considering the purpose and the 
header information already added to a picture or audio, it 
is preferable to be able to freely define (customize), in 
accordance with the purpose, a packet (second packet) to be 
transmitted separately from a communication header, AL 
information, or data. 

Figure 7 is an illustration for explaining a method 
for transmitting information by dynamically multiplexing 
and separating a plurality of logical transmission lines. 
The number of logical transmission lines can be decreased 
by providing the transmitting section with an information 
multiplexing section capable of starting or ending 
multiplexing of the information on the logical transmission 
lines, which transmit a plurality of pieces of data or 
control information, in accordance with a designation by 
a user or the number of logical transmission lines, and by 
providing the reception control section with an information 
separating section for separating the multiplexed 
information. 

In Figure 7, the information multiplexing section is 
referred to as "Group MUX"; specifically, it is possible 
to use a multiplexing system such as H.223. The Group MUX 
can be provided in a transmitting/receiving terminal. By 
providing the Group MUX in a relay router or terminal, it 
is possible to support a narrow-band communication channel. 
Moreover, by realizing the Group MUX with H.223, it is 
possible to interconnect H.223 and H.324. 

To fetch the control information for the information 
multiplexing section (multiplexing control information) 
quickly, the delay due to multiplexing can be reduced by 
transmitting that control information through another 
logical transmission line without multiplexing it with data 
in the information multiplexing section. By communicating 
whether the control information concerned with the 
information multiplexing section is multiplexed with data 
and transmitted or is transmitted through another logical 
transmission line without being multiplexed with the data, 
a user can select whether to keep consistency with 
conventional multiplexing or to reduce the delay due to 
multiplexing. In this case, the multiplexing control 
information concerned with the information multiplexing 
section is information showing the content of multiplexing, 
that is, how the information multiplexing section 
multiplexes each piece of data. 

Similarly, as described above, at least the information 
communicating the start and end of multiplexing, the 
information communicating the combination of logical 
transmission lines to be multiplexed, and the notification 
of the method for transmitting the control information 
concerned with multiplexing (multiplexing control 
information) can be transmitted to a receiving-side 
terminal as control information expressed as a flag, 
counter, or timer, or can be transmitted together with data 
as data control information. The setup time at the 
receiving side can thereby be reduced. Moreover, as 
previously described, it is possible to provide an item for 
expressing a flag, counter, or timer in the transmission 
header of RTP. 

When a plurality of information multiplexing sections 
or a plurality of information separating sections are 
present, it is possible to identify to which information 
multiplexing section the control information (multiplexing 
control information) belongs by transmitting the control 
information together with an identifier for identifying the 
information multiplexing section or information separating 
section. The control information (multiplexing control 
information) includes a multiplexing pattern. Moreover, an 
identifier of the information multiplexing section can be 
generated by using a table of random numbers to decide the 
identifier of the information multiplexing section or 
information separating section between terminals. For 
example, it is possible to generate random numbers in a 
range determined between the transmitting and receiving 
terminals and use the largest value as the identifier 
(identification number) of the information multiplexing 
section. 
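The "largest random number" rule can be sketched as follows 
in Python. The function names draw_candidate and 
choose_mux_id are hypothetical, and how the terminals 
exchange their candidate numbers is not specified in the 
text; here the candidates are simply assumed to have been 
collected already. 

```python
import random
from typing import Iterable, Optional


def draw_candidate(low: int, high: int,
                   rng: Optional[random.Random] = None) -> int:
    """Draw one random number in the range agreed between the terminals."""
    rng = rng or random.Random()
    return rng.randint(low, high)  # both bounds inclusive


def choose_mux_id(candidates: Iterable[int]) -> int:
    """Use the largest drawn value as the multiplexing-section identifier."""
    return max(candidates)
```

For example, if the terminals drew 17, 203, and 42 in an 
agreed range of 1 to 255, choose_mux_id would yield 203 on 
every terminal that sees the same candidates. 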

Because the data multiplexed by the information 
multiplexing section differs from the media types 
conventionally defined in RTP, it is necessary to define, 
as the payload type of RTP, information showing that the 
data is information multiplexed by the information 
multiplexing section (a new media type, H.223, is defined). 

To improve the access speed to multiplexed data, the 
information to be transmitted by or recorded in the 
information multiplexing section is arranged in the order 
of control information followed by data information; the 
multiplexed information is then expected to be analyzed 
quickly. Moreover, header information can be analyzed 
quickly by fixing the items described in accordance with 
the data control information added to the control 
information and by adding and multiplexing an identifier 
(unique pattern) that differs from data. 

Figure 8 is an illustration for explaining the 
transmission procedure of a broadcasting program. By 
transmitting, as control information, the relation between 
the identifier of a logical transmission line and the 
identifier of a broadcasting program as the information of 
the broadcasting program, or by adding the identifier of a 
broadcasting program to data as data control information 
(AL information), it is possible to identify for which 
program the data transmitted through a plurality of 
transmission lines is broadcast. Moreover, by transmitting 
the relation between the identifier of data (SSRC in the 
case of RTP) and the identifier of a logical transmission 
line (e.g. port number of LAN) to a receiving-side terminal 
as control information and transmitting the corresponding 
data after it is confirmed that the receiving-side terminal 
has received the control information (Ack/Reject), it is 
possible to establish the correspondence between data 
pieces even if control information and data are each 
transmitted through an independent transmission line. 
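The order of operations described above, control 
information first and data only after an Ack, can be 
sketched in Python. The Receiver class, the send_stream 
function, and the rule that an unknown port is rejected are 
hypothetical and stand in for whatever the terminals 
actually negotiate. 

```python
from typing import Dict, Iterable, List, Tuple


class Receiver:
    """Hypothetical receiving terminal keeping an SSRC-to-port table."""

    def __init__(self, known_ports: Iterable[int]) -> None:
        self.known_ports = set(known_ports)
        self.ssrc_to_port: Dict[int, int] = {}

    def on_control(self, ssrc: int, port: int) -> str:
        # Assumed rule: reject a mapping onto a port the terminal lacks.
        if port not in self.known_ports:
            return "Reject"
        self.ssrc_to_port[ssrc] = port
        return "Ack"


def send_stream(receiver: Receiver, ssrc: int, port: int,
                payloads: Iterable[bytes]) -> List[Tuple[int, int, bytes]]:
    """Send the mapping first; send data only after it is acknowledged."""
    if receiver.on_control(ssrc, port) != "Ack":
        return []  # nothing is transmitted when the mapping is rejected
    return [(port, ssrc, payload) for payload in payloads]
```

Because the receiver already holds the table when data 
arrives, it can relate each data piece to its stream even 
though the control information travelled on a different 
transmission line. 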

By combining an identifier showing the transmission 
sequence of broadcasting programs or data pieces with 
counter or timer information showing the term of validity 
in which the broadcasting program or data can be used, 
adding the combined information to the broadcasting program 
or data, and transmitting them, it is possible to realize 
broadcasting without a return channel (when the term of 
validity is about to expire, reproduction of the 
information or data for a broadcasting program is started 
even if information is insufficient). Moreover, a method 
can be considered in which control information and data are 
broadcast without being separated from each other by using 
the address of a single communication port (multicast 
address). 

In the case of communication with no back channel, it 
is necessary to transmit control information sufficiently 
before transmitting data so that the receiving terminal can 
know the structural information of the data. Moreover, 
control information should be transmitted through a highly 
reliable transmission channel free from packet loss. 
However, when using a transmission channel having a low 
reliability, it is necessary to cyclically transmit the 
control information having the same transmission sequence 
number. This is not restricted to the case of transmitting 
the control information concerned with a setup time. 
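Cyclic transmission with an unchanged sequence number can 
be sketched in Python as follows. The names repeat_control 
and ControlReceiver are hypothetical, and a lost packet is 
modelled simply as None; the point is that the receiver can 
discard repeats because they carry the same sequence number. 

```python
from typing import List, Optional, Set, Tuple

ControlPacket = Tuple[int, bytes]  # (transmission sequence number, body)


def repeat_control(seq: int, body: bytes, repeats: int) -> List[ControlPacket]:
    """Build the cyclic copies; every copy keeps the same sequence number."""
    return [(seq, body) for _ in range(repeats)]


class ControlReceiver:
    """Hypothetical receiver that applies each control message only once."""

    def __init__(self) -> None:
        self._seen: Set[int] = set()

    def accept(self, packet: Optional[ControlPacket]) -> Optional[bytes]:
        if packet is None:        # the copy was lost on the channel
            return None
        seq, body = packet
        if seq in self._seen:     # a repeat of something already applied
            return None
        self._seen.add(seq)
        return body
```

If the first of three copies is lost, the second is applied 
and the third is ignored, so the control information is 
delivered exactly once without any return channel. 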

Moreover, data can be flexibly controlled and 
transmitted as follows. Before transmitting the data, the 
transmitting side selects the items which can be added as 
data control information (e.g. access flag, random access 
flag, data reproducing time (PTS), or 
data-processing-priority information), decides whether to 
transmit the data control information together with the 
identifier (SSRC) of the data as control information 
through a logical transmission line different from that of 
the data or to transmit it together with the data as data 
control information (information for AL), and communicates 
the decision to the receiving side as control information. 

Thereby, it is possible to transmit data information 
without adding information to AL. Therefore, to transmit 
the data for a picture or audio by using RTP, it is 
unnecessary to extend the definition of the payload having 
been defined so far. 

Figures 9(a) and 9(b) are illustrations showing a 
picture or audio transmission method that considers the 
read time and rise time of a program or data. Particularly, 
consider a case in which the resources of a terminal are 
limited, as in unidirectional satellite broadcasting with 
no return channel or a portable terminal, the program or 
data is present and used at the receiving-side terminal, 
and a necessary program (e.g. H.263, MPEG1/2, or audio 
decoder software) or data (e.g. video data or audio data) 
is present in a memory requiring a long read time (e.g. 
DVD, hard disk, or file server on a network). In this case, 
the setup time of the program or data can be reduced by 
receiving in advance, as control information, or together 
with data, as data control information, an identifier for 
identifying the required program or data, an identifier of 
the stream to be transmitted (e.g. SSRC or Logical Channel 
Number), and a flag, counter (count-up/down), or timer for 
estimating the point of time at which the program or data 
becomes necessary for the receiving terminal (Figure 18). 

When a program or data is transmitted, the transmitting 
side transmits it together with information showing the 
storage destination of the program or data at the receiving 
terminal (e.g. hard disk or memory), the time required for 
start or read, the relation between the type or storage 
destination of a terminal and the time required for start 
or read (e.g. relation between CPU power, storage device, 
and average response time), and the utilization sequence. 
It is thereby possible to schedule the storage destination 
and read time of the program or data for the case in which 
the program or data necessary for the receiving terminal is 
actually required. 

Figures 10(a) and 10(b) are illustrations for 
explaining a method for dealing with zapping (channel 
change of TV). 

When it is necessary to execute a program at a receiving 
terminal differently from the case of conventional 
satellite broadcasting for receiving only pictures, the 
setup time until the program is read and started is a large 
problem. The same is true for the case in which available 
resources are limited like the case of a portable terminal. 

As one of the settlement measures, it is expected that 
the setup time at a receiving-side terminal can be decreased 
by (a) using a main looking-listening section, through 
which the user views and listens to a program, and an 
auxiliary looking-listening section, in which the receiving 
terminal cyclically monitors programs other than the 
program being viewed by the user, receiving, as control 
information (information transmitted by a packet different 
from that of data to control terminal processing) or as data 
control information (information for AL), the relation 
between an identifier for identifying the program or data 
required in advance, flag, counter, or timer information 
for estimating the point of time at which it becomes 
necessary for the receiving terminal, and the broadcasting 
program, and preparing to read the program or data together 
with the data when the program or data necessary for a 
program other than the one being viewed by the user is 
present in a memory requiring a long read time. 

As the second of the settlement measures, it is possible 
to prevent the screen from stopping during setup by 
providing a broadcasting channel that broadcasts only 
heading pictures of the pictures broadcast through a 
plurality of channels. When a user switches programs and 
the necessary program or data is present in a memory 
requiring a long read time, the heading picture of the 
program required by the user is temporarily selected and 
shown to the user, or an indication that the program or 
data is currently being read is shown, and the program 
required by the user is restarted after the necessary 
program or data has been read from the memory. The above 
heading pictures include broadcast pictures obtained by 
cyclically sampling programs broadcast through a plurality 
of channels. 

Moreover, a timer is a time expression and shows the 
point of time at which a program necessary to decode a data 
stream sent from the transmitting side becomes necessary. 
A counter is expressed in a basic time unit determined 
between the transmitting and receiving terminals and can be 
information showing the ordinal number of the time unit. 
A flag is transmitted and communicated together with the 
data transmitted before the time necessary for setup or 
together with control information (information transmitted 
through a packet different from that of data to control 
terminal processing). It is possible to transmit the timer 
and counter by embedding them in data or to transmit them 
as control information. 

Furthermore, to decide a setup time when using a 
transmission line operating on a clock base, such as ISDN, 
the time at which setup is performed can be estimated by 
using a transmission serial number identifying the 
transmission sequence as transmission control information, 
in order to communicate from the transmitting terminal to 
the receiving terminal the point of time at which a program 
or data is required, and communicating the serial number to 
the receiving terminal together with data as data control 
information or as control information. Furthermore, when 
the transmission time fluctuates due to jitter or delay, as 
on the Internet, it is necessary to add the transmission 
time to the setup time by considering the propagation delay 
of transmission in accordance with the jitter or delay time 
obtained by means such as RTCP (the control protocol used 
with RTP, a media transmission protocol for the Internet). 

Figures 11(a) to 19(b) are illustrations showing 
specific examples of protocols actually transferred 
between terminals. 

A transmission format and a transmission procedure are 
described in ASN.1. Moreover, the transmission format is 
extended on the basis of H.245 of ITU. As shown in Figure 
11(a), objects of a picture and audio can have a hierarchical 
structure. In this example, each object has the attributes 
of a broadcasting-program identifier (program ID) and an 
object ID (SSRC), and the structural information and 
synthesizing method between pictures are described by a 
script language such as Java or VRML. 

Figure 11(a) is an illustration showing examples of 
the relation between objects. 

In Figure 11(a), objects are media such as audio-video, 
CG, and text. In the examples in Figure 11(a), objects 
constitute a hierarchical structure. Each object has a 
program number "Program ID" (corresponding to a TV channel) 
and an object identifier "Object ID" for identifying the 
object. When transmitting each object in accordance with 
RTP (Real-time Transport Protocol, a protocol for 
transmitting media used on the Internet), it is possible to 
easily identify the object by making the object identifier 
correspond to SSRC (synchronization source identifier). 
Moreover, it is possible to describe the structure between 
objects with a description language such as Java or VRML. 

Two types of methods for transmitting the objects are 
considered. One is the broadcasting type in which the 
objects are unilaterally transmitted from a 
transmitting-side terminal. The other is the type 
(communication type) for transferring the objects between 
transmitting and receiving terminals (terminals A and B). 

For example, it is possible to use RTP as a transmission 
method in the case of the Internet. In the case of the 
standards for video telephones, control information is 
transmitted by using a transmission channel referred to as 
LCN0. In the example in Figure 11(a), a plurality of 
transmission channels are used for transmission. The same 
program channel (program ID) is assigned to these channels. 

Figure 11(b) is an illustration for explaining how to 
realize a protocol for realizing the functions described 
for the present invention. The transmission protocol 
(H.245) used for the video-telephone standards (H.324 and 
H.323) is described below. The functions described for the 
present invention are realized by extending H.245. 

The description method shown by the example in Figure 
11(b) is the protocol description method referred to as 
ASN.1. "Terminal Capability Set" expresses the 
performance of a terminal. In the example in Figure 11(b), 
the function described as "mpeg4 Capability" is extended 
for the conventional H.245. 



In Figure 12, "mpeg4 Capability" describes the maximum 
number of pictures ("Max Number Of pictures") and the 
maximum number of audio ("Max Number Of Audio") which can 
be simultaneously processed by a terminal and the maximum 
number of multiplexing functions ("Max Number Of Mux") 
which can be realized by a terminal. 

In Figure 12, these are expressed as the maximum number 
of objects ("Number Of Process Object") which can be 
processed. Moreover, a flag showing whether a 
communication header (expressed as AL in Figure 12) can be 
changed is described. When the value of the flag is true, 
the communication header can be changed. When terminals 
communicate the number of objects which can be processed 
to each other by using "MPEG4 Capability", the receiving 
side of the message returns "MPEG4 Capability Ack" to the 
terminal from which "MPEG4 Capability" was transmitted if 
it can accept (process) the objects, but returns "MPEG4 
Capability Reject" to the terminal if not. 
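The Ack/Reject exchange can be sketched in Python. The 
class Mpeg4Capability mirrors the three maxima named for 
Figure 12, but the field spellings, the function respond, 
and the acceptance rule (every requested count must fit 
within the local maximum) are assumptions made for 
illustration and are not taken from the figure. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mpeg4Capability:
    """Hypothetical rendering of the maxima described for Figure 12."""
    max_number_of_pictures: int
    max_number_of_audio: int
    max_number_of_mux: int


def respond(requested: Mpeg4Capability, local: Mpeg4Capability) -> str:
    """Return Ack if the local terminal can process what is requested."""
    acceptable = (
        requested.max_number_of_pictures <= local.max_number_of_pictures
        and requested.max_number_of_audio <= local.max_number_of_audio
        and requested.max_number_of_mux <= local.max_number_of_mux
    )
    if acceptable:
        return "MPEG4 Capability Ack"
    return "MPEG4 Capability Reject"
```

A terminal able to handle four pictures, two audio streams, 
and one multiplexer would acknowledge a request for two 
pictures and reject a request for eight. 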

Figure 13(a) shows how to describe a protocol for using 
the above Group MUX, which multiplexes a plurality of 
logical channels onto one transmission channel (a 
transmission channel of LAN in this example) so that the 
logical channels share the transmission channel. In the 
example in Figure 13(a), the multiplexing means (Group MUX) 
is made to correspond to the transmission channel ("LAN 
Port Number") of LAN (Local Area Network). "Group Mux ID" 
is an identifier for identifying the multiplexing means. 
When terminals share the multiplexing means by using 
"Create Group Mux" and communicate with each other, the 
receiving side of the message returns "Create Group Mux 
Ack" to the terminal from which "Create Group Mux" was 
transmitted if it can accept (use) the multiplexing means, 
but returns "Create Group Mux Reject" to the terminal if 
not. Separating means, which performs the operation 
reverse to that of the multiplexing means, can be realized 
by the same method. 

In Figure 13(b), a case of deleting already-generated 
multiplexing means is described. 

In Figure 13(c), the relation between the transmission 
channel of LAN and a plurality of logical channels is 
described. 

The transmission channel of LAN is described in 
accordance with "LAN Port Number" and the logical channels 
are described in accordance with "Logical Port Number". 

In the case of the examples in Figure 13(c), it is 
possible to make the transmission channel of one LAN 
correspond to up to 15 logical channels. 

In Figure 13, when only one MUX can be used, Group Mux 
ID is unnecessary. To use a plurality of MUXes, Group Mux 
ID is necessary for each command of H.223. Furthermore, 
it is possible to use a flag for communicating the relation 
between the ports used between the multiplexing means and 
separating means. Furthermore, it is possible to use a 
command making it possible to select whether to multiplex 
control information or to transmit the information through 
another logical transmission line. 

In the explanation of Figures 13(a) to 13(c), the 
transmission channel uses LAN. However, it is also 
possible to use a system using no Internet protocol, like 
H.223 or MPEG2. 

In Figure 14, "Open Logical Channel" shows the protocol 
description for defining the attribute of a transmission 
channel. In the example in Figure 14, "MPEG4 Logical 
Channel Parameters" is extended and defined for the 
protocol of H.245. 

Figure 15 shows that a program number (corresponding 
to a TV channel) and a program name are made to correspond 
to the transmission channel of LAN ("MPEG4 Logical Channel 
Parameters"). 

Moreover, in Figure 15, "Broadcast Channel Program" 
denotes a description method for transmitting the 
correspondence between LAN transmission channel and 
program number in accordance with the broadcasting type. 
The example in Figure 15 makes it possible to transmit the 
correspondence between up to 1,023 transmission channels 
and program numbers. Because transmission is unilaterally 
performed from the transmitting side to the receiving side 
in the case of broadcasting, it is necessary to cyclically 
transmit these pieces of information by considering the 
loss during transmission. 

In Figure 16(a), the attribute of an object (e.g. 
picture or audio) to be transmitted as a program is described 
("MPEG4 Object Classdefinition"). Object information 
("Object Structure Element") is made to correspond to a 
program identifier ("Program ID"). It is possible to make 
up to 1,023 objects correspond to program identifiers. As 
the object information, a LAN transmission channel ("LAN 
Port Number"), a flag showing whether scramble is used 
("Scramble Flag"), a field for defining an offset value for 
changing the processing priority when a terminal is 
overloaded ("CGD Offset"), and an identifier ("Media Type") 
for identifying the type of the media (picture or audio) to 
be transmitted are described. 

In the example in Figure 16(b), AL (in this case, 
defined as additional information necessary to decode 
pictures for one frame) is added to control decoding of ES 
(in this case, defined as a data string corresponding to 
pictures for one frame). As AL information, the following 
are defined. 

(1) Random Access Flag (flag showing whether the frame 
is independently reproducible; true for an intra-frame 
encoded picture frame) 

(2) Presentation Time Stamp (time at which the frame is 
displayed) 

(3) CGD Priority (value of priority for deciding the 
processing priority when a terminal is overloaded) 

The example shows a case of transmitting the data 
string for one frame by using RTP (Real-time Transport 
Protocol, a protocol for transmitting continuous media 
through the Internet). "AL Reconfiguration" is a 
transmission expression for changing the maximum value that 
can be expressed by the above AL. 

The example in Figure 16(b) makes it possible to 
express up to 2 bits as "Random Access Flag Max Bit". For 
example, when there is no bit, Random Access Flag is not 
used. When there are two bits, the maximum value is equal 
to 3. 
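The relation between the number of bits and the largest 
expressible value is the usual one for an unsigned field, 
sketched here in Python. The function name max_value is 
hypothetical, and returning None for a zero-bit field is an 
assumed way of expressing "not used". 

```python
from typing import Optional


def max_value(bit_count: int) -> Optional[int]:
    """Largest value an unsigned field of bit_count bits can express."""
    if bit_count < 0:
        raise ValueError("bit_count must not be negative")
    if bit_count == 0:
        return None  # the field is not used at all
    return (1 << bit_count) - 1
```

With two bits the result is 3, matching the example in the 
text, and with no bits the field is absent. 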

Moreover, the expression with a real number part and 
a mantissa part is allowed (e.g. 3 ~ 6). When no data is 
set, an operation under the state decided by default is 
allowed. 

In Figure 17, "Setup Request" shows a transmission 
expression for transmitting a setup time. "Setup Request" 
is transmitted before a program is transmitted. In it, a 
transmission channel number ("Logical Channel Number") to 
be used, a program ID ("execute Program Number") to 
be executed, a data ID ("data Number") to be used, and the 
ID of a command ("execute Command Number") to be executed 
are made to correspond to each other and transmitted to a 
receiving terminal. Moreover, an execution authorizing 
flag ("flag"), a counter ("counter") describing how many 
times Setup Request must be received before execution 
starts, and a timer value ("timer") showing how much time 
must pass before execution starts can be used as other 
expression methods by making them correspond to 
transmission channel numbers. 
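
One possible way for a receiving terminal to handle these items is sketched below. The field names follow the quoted identifiers, but the class names, the rule that the timer is measured from the first Setup Request received on a channel, and the way the three conditions are combined are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SetupRequest:
    logical_channel_number: int
    execute_program_number: int
    data_number: int
    execute_command_number: int
    flag: bool = True               # execution authorizing flag
    counter: Optional[int] = None   # start after this many requests have been received
    timer: Optional[float] = None   # start after this much time has passed


class SetupState:
    """Tracks, for each logical channel, whether execution may start (assumed policy)."""

    def __init__(self) -> None:
        self._received: dict[int, int] = {}
        self._first_seen: dict[int, float] = {}

    def on_request(self, req: SetupRequest, now: float) -> bool:
        ch = req.logical_channel_number
        self._received[ch] = self._received.get(ch, 0) + 1
        self._first_seen.setdefault(ch, now)
        if not req.flag:
            return False  # execution is not authorized
        if req.counter is not None and self._received[ch] < req.counter:
            return False  # not enough requests received yet
        if req.timer is not None and now - self._first_seen[ch] < req.timer:
            return False  # not enough time has passed yet
        return True
```

For example, with counter set to 2, the first call to on_request returns False and the second returns True.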

Rewriting of AL information and securing of rise time 
of Group Mux are listed as examples of requests to be 
demanded. 

Figure 18 is an illustration for explaining a 
transmission expression for communicating whether to use 
the AL described for Figure 16(b) from a transmitting 
terminal to a receiving terminal ("Control AL definition"). 

In Figure 18, if "Random Access Flag Use" is true, 
Random Access Flag is used. If not, it is not used. It 
is possible to transmit the AL change notification as 
control information through a transmission channel 
separate from that of data or transmit it through the 
same transmission channel as that of data together with the 
data. 

A decoder program is listed as a program to be executed. 
Moreover, a setup request can be used for broadcasting and 
communication. Furthermore, which item serving as control 
information is used as AL information is designated to a 
receiving terminal in accordance with the above request. 
Furthermore, it is possible to designate which item is used 
as communication header, which item is used as AL 
information and which item is used as control information 
to a receiving terminal. 

Figure 19(a) shows the example of a transmission 
expression for changing the structure of header information 
(data control information, transmission control 
information, and control information) to be transmitted by 
using an information frame identifier ("header ID") between 
transmitting and receiving terminals in accordance with the 
purpose. 

In Figure 19(a), "class ES header" separates the 
structure of the data control information to be transmitted 
through the same transmission channel as that of data from 
that of the information with which transmission control 
information is transmitted between transmitting and 
receiving terminals in accordance with an information frame 
identifier. 

For example, only the item of "buffer Size ES" is used 
when the value of "header ID" is 0 but the item of "reserved" 
is added when the value of "header ID" is 1. 

Moreover, by using a default identifier ("use Header 
Extension"), it is decided whether to use a default-type 
information frame. When "use Header Extension" is true, 
an item in an if-statement is used. It is assumed that these 
pieces of structural information are previously decided 
between transmitting and receiving terminals. 
Furthermore, it is possible to use a structure for using 
either of an information frame identifier and a default 
identifier. 
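
The selection of a header structure by "header ID" can be sketched as follows. The table LAYOUTS, the function name, the one-byte identifier at the head of the payload, and the 16-bit width of each item are hypothetical choices for illustration; only the rule that ID 0 carries "buffer Size ES" alone and ID 1 adds "reserved" comes from the description above.

```python
import struct

# Layouts assumed to be agreed in advance between transmitting and receiving terminals:
# header ID 0 -> "buffer Size ES" only, header ID 1 -> "buffer Size ES" plus "reserved".
LAYOUTS = {
    0: ("buffer_size_es",),
    1: ("buffer_size_es", "reserved"),
}


def parse_es_header(payload: bytes) -> dict:
    """Parse a header whose first byte is the header ID, followed by 16-bit items."""
    header_id = payload[0]
    fields = LAYOUTS[header_id]
    # Read one big-endian unsigned 16-bit value per item, starting after the ID byte.
    values = struct.unpack_from(f">{len(fields)}H", payload, 1)
    return {"header_id": header_id, **dict(zip(fields, values))}
```

Because the identifier sits at a fixed position, the receiving side can read it first and only then decide how to interpret the remaining bytes.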

In Figure 19(b), "AL configuration" shows an example 
for changing the structure of control information to be 
transmitted through a transmission channel different from 
that of data between transmitting and receiving terminals 
in accordance with the purpose. The usage of an information 
frame identifier and that of a default identifier are the 
same as the case of Figure 19(a). 

In the case of the present invention, methods for 
realizing a system for simultaneously synthesizing and 
displaying a plurality of pictures and a plurality of audio 
are specifically described from the following viewpoints. 

(1) A method for transmitting (communicating and 
broadcasting) a picture and an audio through a plurality 
of logical transmission lines and controlling them. 
Particularly, a method for respectively transmitting 
control information and data through an independent logical 
transmission line is described. 

(2) A method for dynamically changing header 
information (AL information) added to the data for a picture 
or audio to be transmitted. 

(3) A method for dynamically changing communication 
header information added for transmission. 

Specifically, for Items (2) and (3), a method for 
uniting and controlling the information overlapped in AL 
information and communication header and a method for 
transmitting AL information as control information are 
described. 

(4) A method for dynamically multiplexing and 
separating a plurality of logical transmission lines and 
transmitting information. 

A method for economizing the number of channels of 
transmission lines and a method for realizing efficient 
multiplexing are described. 

(5) A method for reading a program or data and 
transmitting pictures and audio considering a rise time. 
Moreover, a method for reducing an apparent setup time for 
various functions and purposes is described. 

(6) A method for transmitting a picture or audio for 
zapping. 

The present invention is not restricted to only 
synthesis of two-dimensional pictures. It is also possible 
to use an expression method of combining a two-dimensional 
picture with a three-dimensional picture or include a 
picture synthesizing method for synthesizing a plurality 
of pictures so that they are adjacent to each other like 
a wide-visual-field picture (panoramic picture). 

Moreover, the present invention is not intended only for 
such communication systems as bidirectional CATV and 
B-ISDN. For example, it is possible to use radio waves (e.g. 
VHF band or UHF band) or a broadcasting satellite for 
transmission of pictures and audio from a center-side 
terminal to a home-side terminal and an analog telephone 
line or N-ISDN for transmission of information from a 
home-side terminal to a center-side terminal (it is not 
always necessary that pictures, audio, and data are 
multiplexed). 

Moreover, it is possible to use a wireless communication 
system such as IrDA, PHS (Personal Handy-phone System), or 
wireless LAN. Furthermore, a target terminal can be a 
portable terminal such as a portable information terminal 
or a desktop terminal such as a set-top box or personal 
computer. Furthermore, a video telephone, multipoint 
monitoring system, multimedia database retrieval system, 
and game are listed as application fields. The present 
invention includes not only a receiving terminal but also 
a server and a repeater to be connected to a receiving 
terminal. 

Furthermore, in the case of the above examples, a 
method for avoiding the overlap of the (communication) 
header of RTP with AL information and a method for extending 
the communication header of RTP or AL information are 
described. However, it is not always necessary for the 
present invention to use RTP. For example, it is also 
possible to newly define an original communication header 
or AL information by using UDP or TCP. Though an internet 
profile uses RTP sometimes, a multifunctional header such 
as RTP is not defined for a Raw profile. There are four 
types of concepts about AL information and communication 
header as described above. 

Thus, by dynamically deciding the information frame 
of data control information, transmission control 
information, or control information used by the 
transmitting and receiving terminals (e.g. information 
frame including the sequence of information to be added and 
the number of bits for firstly assigning a random access 
flag as 1-bit flag information and secondly assigning 16 
bits in the form of a sequence number), it is possible to 
change only an information frame corresponding to the 
situation in accordance with the purpose or transmission 
line. 

The frame of each piece of information can be any one 
of the frames already shown in Figures 6(a) to 6(d) and in 
the case of RTP, the data control information (AL) can be 
the header information for each medium (e.g. in the case 
of H.263, the header information of the video or that of 
the payload intrinsic to H.263), transmission control 
information can be the header information of RTP, and 
control information can be the information for controlling 
RTP such as RTCP. 

Moreover, for a publicly-known information frame 
previously set between transmitting and receiving 
terminals, a default identifier showing whether information 
is transmitted, received, and processed with that 
information frame is provided for each of data control 
information, transmission control information, and control 
information (information transmitted through a packet 
different from that of data to control terminal processing). 
This makes it possible to know whether information frames 
are changed. By setting the default identifier and 
communicating the changed content (such as change of time 
stamp information from 32 to 16 bits) in accordance with the 
method shown in Figure 16 only when a change is performed, 
unnecessary configuration information is prevented from 
being transmitted when information frames are not 
changed. 
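
The role of the default identifier can be sketched as follows. The dictionary keys, the function names, and the message layout are hypothetical; the sketch only assumes that both terminals already share a default information frame and that the identifier is set only when something differs from it.

```python
# Information frame assumed to be known in advance by both terminals.
DEFAULT_FRAME = {"time_stamp_bits": 32, "sequence_number_bits": 16}


def build_config_message(current: dict) -> dict:
    """Build the message a transmitting terminal would send for its current frame."""
    changed = {k: v for k, v in current.items() if DEFAULT_FRAME.get(k) != v}
    if not changed:
        # Nothing differs from the default, so no configuration content is sent.
        return {"default_identifier": False}
    return {"default_identifier": True, "changes": changed}


def apply_config_message(msg: dict) -> dict:
    """Reconstruct the information frame at the receiving terminal."""
    frame = dict(DEFAULT_FRAME)
    if msg["default_identifier"]:
        frame.update(msg["changes"])
    return frame
```

When the time stamp is shortened from 32 to 16 bits, only that one item is carried in the message. When nothing is changed, the message carries the identifier alone.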

For example, the following two methods are considered 
to change information frames of data control information. 
In the first method, the change is described in the data 
itself: the default identifier (to be written in a fixed 
region or position) of the data control information present 
in the data is set and then the contents of the information 
frame change are described. 

In the second method, only the change of the information 
frames is described in the control information (information 
frame control information): a default identifier provided 
for control information is set, the contents of the 
information frames of the data control information to be 
changed are described and communicated to a receiving 
terminal, it is confirmed through ACK/Reject that the 
information frames of the data control information are 
changed, and thereafter the data in which information frames 
are changed is transmitted. Information frames of 
transmission control information and control information 
can also be changed in accordance with the above two methods 
(Figure 19). 

More specifically, though the header information of 
MPEG2 is fixed, by providing a default identifier for a 
program map table (defined by PSI) for relating the video 
stream of MPEG2-TS (transport stream) with the audio stream 
of it and defining a configuration stream in which a method 
for changing frames of the information for the video stream 
and audio stream is described, it is possible to first 
interpret the configuration stream and then, interpret the 
headers of the video and audio streams in accordance with 
the content of the configuration stream when the default 
identifier is set. It is possible for the configuration 
stream to have the contents shown in Figure 19. 

The contents (transmitted-format information) of the 
present invention about a transmission method and/or a 
structure of the data to be transmitted correspond to, for 
example, an information frame in the case of the above 
embodiment. 

Moreover, for the above embodiments, a case of 
transmitting the contents to be changed concerned with a 
transmission method and/or the structure of the data to be 
transmitted is mainly described. However, it is also 
possible to use a structure for transmitting only the 
identifier for the contents. In this case, as shown in 
Figure 44, it is also possible to use an audio-video 
transmitter provided with (1) transmitting means 5001 for 
transmitting the content concerned with a transmission 
method and/or the structure of the data to be transmitted 
or an identifier showing the content as the 
transmitted-format information through the transmission 
line same as that of the data to be transmitted or a 
transmission line different from the former transmission 
line and (2) storing means 5002 for storing a plurality of 
types of the contents concerned with the transmission 
method and/or the structure of the data to be transmitted 
and a plurality of types of identifiers for the contents, 
in which the identifiers are included in at least one of 
the data control information, transmission control 
information, and information for controlling terminal-side 
processing. Moreover, as shown in Figure 45, it is possible 
to use an audio-video receiver provided with receiving 
means 5101 for receiving the transmission format 
information transmitted from the audio-video transmitter 
and transmission information interpreting means 5102 for 
interpreting the received transmission format information. 
Furthermore, the audio-video receiver can be constituted 
with storing means 5103 for storing a plurality of types 
of contents concerned with the transmission method and/or 
the structure of the data to be transmitted and a plurality 
of types of identifiers for the contents to use the contents 
stored in the storing means to interpret the contents of 
the identifiers when receiving the identifiers as the 
transmission format information. 

More specifically, by preparing a plurality of types 
of information frames previously determined between 
transmitting and receiving terminals and transmitting 
identifiers for the above information frames and 
information frame identifiers for a plurality of types of 
data control information, a plurality of types of 
transmission control information, and a plurality of types 
of control information (information-frame control 
information) together with data or as control information, 
it is possible to identify a plurality of types of data 
control information, a plurality of types of transmission 
control information, and a plurality of types of control 
information and optionally select the information frame of 
each type of information in accordance with the type of a 
medium to be transmitted or the size of a transmission line. 

Identifiers of the present invention correspond to the 
above information frame identifiers. 

It is possible to read and interpret these information 
identifiers and default identifiers even if information 
frames are changed at a receiving-side terminal by adding 
the identifiers to a predetermined fixed-length region or 
predetermined position of the information to be 
transmitted. 

Moreover, in addition to the structures described for 
the above embodiments, it is possible to use a structure 
in which, when it takes a lot of time to set up a necessary 
program or data upon switching programs to be viewed by the 
user, a broadcasting channel for broadcasting only the 
heading pictures of pictures broadcast through a plurality 
of channels is used to temporarily select the caption 
picture of the program to be viewed and show it to the 
user. 

As described above, the present invention makes it 
possible to change frames of the information corresponding 
to the situation in accordance with the purpose or 
transmission line by dynamically determining the frame of 
data control information, transmission control information, 
or control information used by transmitting and receiving 
terminals. 

Moreover, it is possible to know whether information 
frames are changed by providing a default identifier for 
showing whether to transmit or receive and process 
information by a publicly-known information frame 
previously set between transmitting and receiving 
terminals for data control information, transmission 
control information, and control information respectively 
and it is possible to prevent unnecessary configuration 
information from being transmitted even if information 
frames of information are not changed by setting a default 
identifier and communicating changed contents only when 
change is performed. 

Furthermore, it is possible to identify a plurality 
of types of data control information, a plurality of types 
of transmission control information, and a plurality of 
types of control information by preparing a plurality of 
information frames previously determined between 
transmitting and receiving terminals and transmitting 
information frame identifiers for identifying a plurality 
of types of data control information, a plurality of types 
of transmission control information, and a plurality of 
types of control information together with data or as 
control information and optionally select the information 
frame of each type of information in accordance with the 
type of a medium to be transmitted or the size of a 
transmission line. 

It is possible to read and interpret these information 
identifiers and default identifiers even if information 
frames are changed at a receiving-side terminal by adding 
the identifiers to a predetermined fixed-length region or 
predetermined position of the information to be 
transmitted. 

Embodiments of the present invention are described 
below by referring to the accompanying drawings. 

In this case, any one of the above-described problems 
(B1) to (B3) is solved. 

A "picture (or video)" used for the present invention 
includes both a static picture and a moving picture. 
Moreover, a target picture can be a two-dimensional 
picture such as computer graphics (CG) or three-dimensional 
picture data constituted with a wire-frame model. 

Figure 25 is a schematic block diagram of a picture 
encoder and a picture decoder of an embodiment of the present 
invention. 

A transmission control section 4011 for transmitting 
or recording various pieces of encoded information is means 
for transmitting the information through a coaxial cable, 
CATV, LAN, or modem. A picture encoder 4101 has a picture 
encoding section 4012 for encoding picture information by 
a method such as H.263, MPEG1/2, JPEG, or Huffman encoding 
and the transmission control section 4011. Moreover, a 
picture decoder 4102 is constituted with a reception control 
section 4013 for receiving various pieces of encoded 
information, a picture decoding section 4014 for decoding 
various pieces of received picture information, a picture 
synthesizing section 4015 for synthesizing one decoded 
picture or more, and an output section 4016 constituted with 
a display and a printer for outputting pictures. 

Figure 26 is a schematic block diagram of an audio 
encoder and an audio decoder of an embodiment of the present 
invention. 

An audio encoder (sound encoder) 4201 is constituted 
with a transmission control section 4021 for transmitting 
or recording various pieces of encoded information and an 
audio encoding section 4022 for encoding audio information 
by a method such as G.721 or MPEG1 audio. Moreover, an 
audio decoder (sound decoder) 4202 is constituted with a 
reception control section 4023 for receiving various pieces 
of encoded information, an audio decoding section 4024 for 
decoding the above pieces of audio information, an audio 
synthesizing section (sound synthesizing section) 4025 
for synthesizing one decoded audio or more, and output means 
4026 for outputting audio. 

Time-series data for audio or picture is specifically 
encoded or decoded by the above encoder or decoder. 

The communication environments in Figures 25 and 26 
can be a communication environment in which a plurality of 
logical transmission lines can be used without considering 
multiplexing means like the case of internet or a 
communication environment in which multiplexing means must 
be considered like the case of an analog telephone or 
satellite broadcasting. Moreover, a system for 
bilaterally transferring a picture or audio between 
terminals like a video telephone or video conference or a 
system for broadcasting a broadcasting-type picture or 
audio on satellite broadcasting, CATV, or internet is 
listed as a terminal connection system. 

Moreover, a method for synthesizing a picture and audio 
can be defined by describing a picture and an audio, 
structural information for a picture and an audio (display 
position and display time), an audio-video grouping method, 
a picture display layer (depth), and an object ID (ID for 
identifying each object such as a picture or audio) and the 
relation between the attributes of them with a script 
language such as JAVA, VRML, or MHEG. A script describing 
a synthesizing method is obtained from a network or local 
memory. 

Moreover, it is possible to constitute a transmitting 
or receiving terminal by freely combining any number of 
picture encoders, picture decoders, audio encoders, and 
audio decoders. 

Figure 27(a) is an illustration for explaining a 
priority adding section and a priority deciding section for 
controlling the priority for processing under overload. A 
priority adding section 31 for deciding the priority for 
processing encoded information under overload in 
accordance with predetermined criteria by an encoding 
method such as H.263 or G.723 and relating the encoded 
information to the decided priority is provided for the 
picture encoder 4101 and audio encoder 4201. 

The criteria for adding a priority are scene change 
in the case of a picture and audio and audioless blocks in 
the case of a picture frame, stream, or audio designated 
by an editor or user. 

A method for adding a priority to a communication 
header and a method for embedding a priority, at the time 
of encoding, in the header of the encoded video or audio 
bit stream are considered as priority adding methods for 
defining a priority under overload. The former method 
makes it possible to obtain the information concerned with 
a priority without decoding the information and the latter 
method makes it possible to independently handle a single 
bit stream without depending on a system. 

As shown in Figure 27(b), when priority information 
is added to a communication header and one picture frame 
(e.g. intra-frame encoded I-frame or inter-frame encoded 
P- or B-frame) is divided into a plurality of transmission 
packets, a priority is added only to a communication header 
for transmitting the head of a picture frame accessible as 
single information in the case of a picture (when priorities 
are equal in the same picture frame, it is possible to assume 
that the priorities are not changed until the head of the 
next accessible picture frame appears). 
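
How a receiving side might treat packets that carry no priority of their own can be sketched as follows. The tuple representation of a packet and the function name are assumptions for illustration; the sketch only applies the rule that a priority stated at the head of a picture frame holds until the head of the next accessible picture frame appears.

```python
from typing import Iterable, Iterator, Optional, Tuple

# (priority stated in the communication header, or None; payload)
Packet = Tuple[Optional[int], bytes]


def assign_priorities(packets: Iterable[Packet]) -> Iterator[Tuple[int, bytes]]:
    """Carry the last stated priority forward to packets that do not state one."""
    current: Optional[int] = None
    for priority, payload in packets:
        if priority is not None:
            current = priority  # head of a new accessible picture frame
        if current is None:
            raise ValueError("stream does not start at the head of a picture frame")
        yield current, payload
```

Only the first packet of each picture frame needs to carry the priority, which keeps the remaining communication headers shorter.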

Moreover, in the case of a decoder, a priority deciding 
section 32 for deciding a processing method is provided for 
the picture decoder 4102 and audio decoder 4202 in 
accordance with the priorities of various pieces of encoded 
information received under overload. 

Figures 28(a) to 28(c) are illustrations for 
explaining the grading for adding a priority. Decoding is 
performed by using two types of priorities for deciding the 
priority for processing under overload at a terminal. 

That is, a stream priority (Stream Priority; 
inter-time-series-data priority) for defining the priority 
for processing under overload in bit streams such as picture 
and audio and a frame priority (Frame Priority; 
intra-time-series-data priority) for defining the priority for 
processing under overload in frames such as picture frames 
in the same stream are defined (see Figure 28(a)). 

The former stream priority makes it possible to handle 
a plurality of videos or audios. The latter frame priority 
makes it possible to add a different priority to a picture 
scene change or the same intra-frame encoded picture frame 
(I-frame) in accordance with the intention of an editor. 

A value expressed by the stream priority represents 
a case of handling it as a relative value and a case of 
handling it as an absolute value (see Figures 28(b) and 
28(c)). 

The stream and frame priorities are handled by a 
repeating terminal such as a router or gateway on a network 
and by transmitting and receiving terminals in the case of 
a terminal. 

Two types of methods for expressing an absolute value 
or relative value are considered. One of them is the method 
shown in Figure 28(b) and the other of them is the method 
shown in Figure 28(c). 

In Figure 28(b), the priority of an absolute value is 
a value showing the sequence in which picture streams (video 
streams) or audio streams added by an editor or mechanically 
added are processed (or to be processed) under overload (but 
not a value considering the load fluctuation of an actual 
network or terminal). The priority of a relative value is 
a value for changing the value of an absolute priority in 
accordance with the load of a terminal or network. 

By dividing a priority into a relative value and an 
absolute value and controlling them separately, so that 
only relative values are changed at the transmitting side 
or by a repeater in accordance with the load fluctuation of 
a network or the like, it is possible to record data into 
a hard disk or VTR while leaving the absolute priority 
originally added to a video or audio stream. Thus, 
when the value of the absolute priority is recorded, it is 
possible to reproduce a picture or audio that is not 
influenced by the load fluctuation of a network or the like. 
Moreover, it is possible to transmit a relative or absolute 
priority through a control channel independently of data. 
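
The separation of the two values can be sketched as follows. The class and function names are hypothetical, and combining the two values by addition is only one assumed way for a relative value to change an absolute priority.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StreamPriority:
    absolute: int      # added by an editor or mechanically; not rewritten in transit
    relative: int = 0  # adjustment that follows the load of a network or terminal


def adjust_for_load(p: StreamPriority, load_offset: int) -> StreamPriority:
    """A transmitting side or repeater rewrites only the relative value."""
    return replace(p, relative=load_offset)


def effective(p: StreamPriority) -> int:
    """Priority used for processing under overload (addition is an assumption)."""
    return p.absolute + p.relative


def record(p: StreamPriority) -> int:
    """Only the absolute value is kept when recording to a hard disk or VTR."""
    return p.absolute
```

Because adjust_for_load never touches the absolute field, the recorded value is the same no matter how the load changed during transmission.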

Moreover, in Figure 28(b), it is possible to make the 
grading finer than that of a stream priority and handle a frame 
priority for defining the priority for frame processing 
under overload as the value of a relative priority or handle 
it as the value of an absolute priority. For example, by 
describing an absolute frame priority in encoded picture 
information and describing a relative frame priority 
corresponding to the absolute priority added to the picture 
frame in the communication header of a communication packet 
for transmitting encoded information in order to reflect 
the load fluctuation of a network or terminal, it is possible 
to add a priority corresponding to the load of a network 
or terminal even at a frame level while leaving an original 
priority. 

Moreover, it is possible to transmit a relative 
priority by describing the relation with a frame not in a 
communication header but in a control channel independently 
of data. Thereby, it is possible to record data into a hard 
disk or VTR while leaving an absolute priority originally 
added to a picture or audio stream. 

Furthermore, in Figure 28(b), when reproducing data 
at a receiving terminal while transmitting the data through 
a network without recording the data at the receiving 
terminal, it is possible to compute the value of an absolute 
priority and that of a relative priority at frame and stream 
levels at the transmitting side and thereafter transmit 
only absolute values because it is unnecessary to control 
absolute and relative values by separating them from each 
other at a receiving terminal. 

In Figure 28(c), the priority of an absolute value is 
a value uniquely determined between frames obtained from 
the relation between Stream Priority and Frame Priority. 
The priority of a relative value is a value showing the 
sequence in which picture streams or audio streams added 
by an editor or mechanically added are processed (or to be 
processed) under overload. In the case of the example in 
Figure 28(c), the frame priority of a picture or audio stream 
(relative; relative value) and the stream priority for each 
stream are added. 

An absolute frame priority (absolute; absolute value) 
is obtained from the sum of a relative frame priority and 
a stream priority (that is, absolute frame priority = 
relative frame priority + stream priority). To obtain an 
absolute frame priority, it is also possible to use a 
subtracting method or a constant-multiplying method. 

An absolute frame priority is mainly used in a network. 
This is because the expression using an absolute value 
removes the need for a repeater such as a router or gateway 
to decide a priority for each frame by considering 
Stream Priority and Frame Priority. By using the absolute 
frame priority, such processing as disuse of a frame by a 
repeater is simplified. 
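
The simplification at a repeater can be sketched as follows. The names Packet, stamp, and repeater_forward are hypothetical, and the sketch assumes that a smaller value means a higher priority and that packets whose value exceeds a threshold are disused.

```python
from typing import List, NamedTuple


class Packet(NamedTuple):
    absolute_priority: int
    payload: bytes


def stamp(stream_priority: int, relative_frame_priority: int, payload: bytes) -> Packet:
    """Transmitting side: absolute frame priority = relative frame priority + stream priority."""
    return Packet(relative_frame_priority + stream_priority, payload)


def repeater_forward(packets: List[Packet], threshold: int) -> List[Packet]:
    """Repeater: compare one value per packet; no per-frame priority decision is needed."""
    return [p for p in packets if p.absolute_priority <= threshold]
```

The repeater never sees Stream Priority or Frame Priority separately. It performs a single comparison per packet, which is the simplification described above.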

Moreover, it can be expected to apply a relative frame 
priority mainly to an accumulation system for performing 
recording or editing. In the case of an editing operation, 
a plurality of picture and audio streams may be handled at 
the same time. In this case, the number of picture streams 
or the number of frames that can be reproduced may be limited 
depending on the load of a terminal or network. 

In the above case, because Stream Priority is separated 
from Frame Priority, it is sufficient to change only the 
Stream Priority of a stream which an editor wants to 
preferentially display or a user wants to see; unlike the 
case in which only an absolute value is expressed, it is 
unnecessary to recalculate every Frame Priority. Thus, it 
is necessary to use an absolute expression or a relative 
expression in accordance with the purpose. 

By describing whether to use a stream priority as a 
relative value or absolute value, it is possible to 
effectively express a priority for transmission and 
accumulation. 

In the case of the example in Figure 28(b), a flag or 
identifier following the stream priority is used to express 
whether the value expressed by the stream priority is an 
absolute value or relative value. In the 
case of a frame priority, a flag or identifier is unnecessary 
because a relative value is described in a communication 
header and an absolute value is described in an encoded 
frame. 

In the case of the example in Figure 28(c), a flag or 
identifier for identifying whether a frame priority is an 
absolute value or relative value is used. In the case of 
an absolute value, the frame priority is a priority 
calculated in accordance with a stream priority and a 
relative frame priority and therefore, the calculation is 
not performed by a repeater or terminal. Moreover, when 
the calculation formula is already known at a terminal, it 
is possible to inversely calculate a relative frame 
priority from an absolute frame priority and a stream 
priority. For example, it is also possible to obtain the 
absolute priority (Access Unit Priority) of a packet to be 
transmitted from the relational expression 

"Access Unit Priority = stream priority - frame 
priority". 

In this case, it is also possible to express the frame 
priority as a degradation priority because it is obtained 
after being subtracted from the stream priority. 
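
The relational expression and its inverse can be written out as follows. The function names are hypothetical; only the subtraction itself comes from the expression above.

```python
def access_unit_priority(stream_priority: int, frame_priority: int) -> int:
    """Access Unit Priority = stream priority - frame priority."""
    return stream_priority - frame_priority


def frame_priority_from(access_unit_priority_value: int, stream_priority: int) -> int:
    """Inverse calculation at a terminal that already knows the formula."""
    return stream_priority - access_unit_priority_value
```

A terminal that receives the absolute priority together with the stream priority can thus recover the frame priority without it being transmitted.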

Moreover, it is also possible to control data 
processing by relating one stream priority or more to the 
priority for processing of the data passing through the 
logical channel of TCP/IP (port No. of LAN). 

Furthermore, it is expected that the necessity for 
retransmission can be reduced by assigning a stream 
priority or frame priority lower than that of a character 
or control information to a picture or audio. This is 
because no problem occurs in most cases even if a part of 
a picture or audio is lost. 

Figure 29 is an illustration for explaining a method 
for assigning a priority to multi-resolution video data. 

When one stream is constituted with a plurality of 
substreams, it is possible to define a substream processing 
method by adding a stream priority to each substream and 
describing a logical sum or logical product at the time of 
accumulation or transmission. 

In the case of a wavelet, it is possible to decompose 
one picture frame into a plurality of different-resolution 
picture frames. Moreover, even in the case of a DCT-based 
encoding method, it is possible to decompose one picture 
frame into a plurality of different-resolution picture 
frames by dividing the picture frame into a high-frequency 
component and a low-frequency component and encoding them. 

In addition to the stream priorities added to a plurality 
of picture streams constituted with a series of decomposed 
picture frames, the relation between the picture streams is 
described with AND (logical product) and OR (logical sum). 
Specifically, suppose that the stream priority of a stream 
A is 5 and that of a stream B is 10 (the smaller the 
numerical value, the higher the priority). When stream data 
is disused depending on the priority, the stream B would 
ordinarily be disused. In the case of AND, however, 
describing the relation between the streams defines that 
the stream B is transmitted and processed without being 
disused even if the priority of the stream B is lower than 
the threshold priority. 

Thereby, relevant streams can be processed without 
being disused. In the case of OR, it is defined that 
relevant streams can be disused. As before, it is possible 
to perform disuse processing at a transmitting or receiving 
terminal or a repeating terminal. 
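
The effect of the AND and OR descriptions on disuse 
processing can be sketched as follows. This is one possible 
reading under stated assumptions: the names Substream and 
streams_to_keep, the tuple form of a relation, and the 
threshold value are hypothetical and are not defined in the 
text. 

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Substream:
    name: str
    priority: int  # smaller value = higher priority, as in the text


def streams_to_keep(streams: List[Substream],
                    relations: List[Tuple[str, str, str]],
                    threshold: int) -> List[str]:
    """Return the names of substreams that survive disuse processing.

    A substream whose priority value exceeds the threshold is a candidate
    for disuse. An ("A", "B", "AND") relation keeps both streams when either
    one survives; an "OR" relation lets each stream be judged on its own.
    """
    keep = {s.name for s in streams if s.priority <= threshold}
    changed = True
    while changed:  # propagate AND relations until nothing changes
        changed = False
        for a, b, op in relations:
            if op == "AND" and ((a in keep) != (b in keep)):
                keep.update((a, b))
                changed = True
    return [s.name for s in streams if s.name in keep]


streams = [Substream("A", 5), Substream("B", 10)]
with_and = streams_to_keep(streams, [("A", "B", "AND")], threshold=7)
with_or = streams_to_keep(streams, [("A", "B", "OR")], threshold=7)
```

With the priorities 5 and 10 from the example above and an 
assumed threshold of 7, the AND relation retains both streams 
whereas the OR relation allows the stream B to be disused. 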

Moreover, as another operator for relational 
description, when the same video clip is encoded at 24 Kbps 
and 48 Kbps respectively, there is a case in which either 
the 24-Kbps or the 48-Kbps stream may be reproduced 
(exclusive logical sum EX-OR as relational description). 

When the priority of the former is set to 10 and that 
of the latter is set to 5, a user can reproduce the latter 
in accordance with the priority or select the latter without 
following the priority. 

Figure 30 is an illustration for explaining a method 
for constituting a communication payload. 

When a stream is constituted with a plurality of 
substreams, disuse at a transmission packet level becomes 
easy by, for example, constituting transmission packets in 
descending order of priority in accordance with the stream 
priority added to each substream. Moreover, disuse at a 
communication packet level becomes easy by finely grading 
the information and uniting the information for objects 
respectively having a high frame priority into one 
communication packet. 
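
A minimal sketch of constituting packets in descending order 
of priority is shown below. The function names, the tuple 
layout, and the fixed payload size are assumptions made for 
illustration; Figure 30 itself is not reproduced here. 

```python
from typing import List, Tuple

Packet = Tuple[int, bytes]  # (stream priority, payload); smaller value = higher priority


def build_packets(substreams: List[Tuple[int, bytes]], payload_size: int) -> List[Packet]:
    """Split each substream into packets, highest-priority substream first."""
    packets: List[Packet] = []
    for priority, data in sorted(substreams, key=lambda s: s[0]):
        for offset in range(0, len(data), payload_size):
            packets.append((priority, data[offset:offset + payload_size]))
    return packets


def disuse_tail(packets: List[Packet], max_packets: int) -> List[Packet]:
    """Because packets are ordered by priority, disuse is a simple truncation."""
    return packets[:max_packets]


packets = build_packets([(10, b"bbbb"), (5, b"aaaaaa")], payload_size=4)
kept = disuse_tail(packets, max_packets=2)
```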

By relating the sliced structure of a picture to a 
communication packet, recovery from a missing packet becomes 
easy. That is, by relating the sliced structure of a video 
to a packet structure, a re-sync marker for 
resynchronization is unnecessary. Unless the sliced 
structure coincides with the structure of a communication 
packet, it is necessary to add a re-sync marker (a marker 
for making a returning position known) so that 
resynchronization can be performed if information is 
damaged due to a missing packet. 

In accordance with the above, it is conceivable to apply 
strong error protection to a communication packet having a 
high priority. Moreover, the sliced structure of a picture 
represents a unit of collected picture information such as 
a GOB or MB. 

Figure 31 is an illustration for explaining a method 
for relating data to a communication payload. By 
transmitting a method for relating a stream or object to 
a communication packet together with control information 
or data, it is possible to generate an arbitrary data format 
in accordance with the communication state or purpose. For 
example, in the case of RTP (Real-time Transport Protocol), 
the payload of RTP is defined for each encoding to be handled. 
The format of the existing RTP is fixed. In the case of 
H.263, as shown in Figure 31, three data formats from Mode 
A to Mode C are defined. In the case of H.263, a 
communication payload intended for a multi-resolution 
picture format is not defined. 

In the case of the example in Figure 31, Layer No. and 
the above relational description (AND, OR) are added to the 
data format of Mode A and defined. 

Figure 32 is an illustration for explaining the 
relation between frame priority, stream priority, and 
communication packet priority. 

Moreover, Figure 32 shows an example of using a 
priority added to a communication packet on a transmission 
line as a communication packet priority and relating a 
stream priority and a frame priority to the communication 
packet priority. 

Generally, in the case of communication using IP, it 
is necessary to transmit data by relating a frame priority 
or stream priority added to picture or audio data to the 
priority of a low-order IP packet. Because the picture or 
audio data is divided into IP packets and transmitted, it 
is necessary to relate priorities to each other. In the 
case of the example in Figure 32, because the stream priority 
takes values from 0 to 3 and the frame priority takes values 
from 0 to 5, high-order data can take priorities from 0 to 
15. 

In the case of IPv6, priorities (4 bits) from 0 to 7 
are reserved for congestion-controlled traffic. 
Priorities from 8 to 15 are reserved for real-time 
communication traffic or not-congestion-controlled 
traffic. Priority 15 is the highest priority and priority 
8 is the lowest priority. This represents the priority at 
the packet level of IP. 

In the case of data transmission using IP, it is 
necessary to relate high-order priorities from 0 to 15 to 
low-order IP priorities from 8 to 15. To relate priorities 
to each other, it is possible to use a method of clipping 
some of the high-order priorities or to relate the 
priorities to each other by using a performance function. 
The relating of a high-order data priority to a low-order 
IP priority is performed at a repeating node (router or 
gateway) or at transmitting and receiving terminals. 
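
The two ways of relating priorities mentioned above can be 
sketched as follows. The text does not give the concrete 
clipping rule or function, so both mappings below are 
assumptions; the sketch also assumes that a larger high-order 
value means a higher priority, matching the IP-level 
convention in which 15 is the highest. 

```python
IP_PRIORITY_MIN = 8   # lowest priority for non-congestion-controlled traffic
IP_PRIORITY_MAX = 15  # highest priority


def map_by_clipping(high_order_priority: int) -> int:
    """Clip high-order priorities 0..15 into the IP range 8..15 (assumed rule)."""
    if not 0 <= high_order_priority <= 15:
        raise ValueError("high-order priority must be in 0..15")
    return min(IP_PRIORITY_MAX, max(IP_PRIORITY_MIN, high_order_priority))


def map_by_function(high_order_priority: int) -> int:
    """Compress 0..15 linearly onto 8..15 (one possible mapping function)."""
    if not 0 <= high_order_priority <= 15:
        raise ValueError("high-order priority must be in 0..15")
    return IP_PRIORITY_MIN + high_order_priority // 2


clipped = [map_by_clipping(p) for p in range(16)]
scaled = [map_by_function(p) for p in range(16)]
```

Clipping preserves distinctions among the upper priorities and 
merges the lower ones, while the linear mapping merges 
neighboring pairs over the whole range. 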

Transmitting means is not restricted to only IP. It 
is possible to use a transmission packet having a flag 
showing whether it can be disused, as in ATM or the TS 
(transport stream) of MPEG2. 

The frame priority and stream priority having been 
described so far can be applied to a transmitting medium 
or data-recording medium. It is possible to use a floppy 
disk or optical disk as a data-recording medium. 

Moreover, it is possible to use not only the floppy 
disk or optical disk but also a medium such as an IC card 
or ROM cassette as long as a program can be recorded in the 
medium. Furthermore, it is possible to use an audio-video 
repeater such as a router or gateway for relaying data. 

Furthermore, preferential retransmission is realized 
by deciding time-series data to be retransmitted in 
accordance with the information of Stream Priority 
(inter-time-series-data priority) or Frame Priority 
(intra-time-series-data priority). For example, when 
decoding is performed at a receiving terminal in accordance 
with priority information, it is possible to prevent a 
stream or frame that is not an object for processing from 
being retransmitted. 

Furthermore, separately from a present priority to be 
processed, it is possible to decide a stream or frame having 
a priority to be retransmitted in accordance with the 
relation between retransmission frequency and successful 
transmission frequency. 

Furthermore, in the case of a transmitting-side 
terminal, preferential transmission is realized by 
deciding time-series data to be transmitted in accordance 
with the information of Stream Priority 
(inter-time-series-data priority) or Frame Priority 
(intra-time-series-data priority). For example, by deciding 
the priority of a stream or frame to be transmitted in 
accordance with an average transfer rate or retransmission 
frequency, it is possible to transmit pictures or audio 
adaptively even when a network is overloaded. 

The above embodiment is not restricted to two- 
dimensional-picture synthesis. It is also possible to use 
an expression method obtained by combining a two- 
dimensional picture with a three-dimensional picture or 
include a picture-synthesizing method for synthesizing a 
plurality of pictures so as to be adjacent to each other 
like a wide-visual-field picture (panorama picture). 
Moreover, communication systems to which the present 
invention is directed are not restricted to bidirectional 
CATV or B-ISDN. For example, transmission of pictures and 
audio from a center-side terminal to a house-side terminal 
can use radio waves (e.g. VHF band or UHF band) or satellite 
broadcasting, and information origination from the 
house-side terminal to the center-side terminal can use an 
analog telephone line or N-ISDN (it is not always necessary 
that pictures, audio, or data are multiplexed). Moreover, 
it is possible to use a wireless communication system such 
as IrDA, PHS (Personal Handy Phone), or wireless LAN. 

Furthermore, a target terminal can be a portable 
terminal such as a portable information terminal or a 
desktop terminal such as a set-top box or personal computer. 

As described above, the present invention makes it easy 
to handle a plurality of video streams and a plurality of 
audio streams and to synchronize and reproduce mainly the 
important scene cuts together with audio by reflecting the 
intention of an editor. 

An embodiment of the present invention is described 
below by referring to the accompanying drawings. 

The embodiment described below solves any one of the 
above problems (C1) to (C3). 

Figure 33 shows the structure of the transmitter of 
the first embodiment. Symbol 2101 denotes a picture-input 
terminal, and one sheet of picture measures 144 pixels by 
176 pixels. Symbol 2102 denotes a video encoder that is 
constituted with four components 1021, 1022, 1023, and 
1024 (see Recommendation H.261). 

Symbol 1021 denotes a switching unit for dividing an 
input picture into macroblocks (a square region of 16 pixels 
by 16 pixels) and deciding whether to intra-encode or 
inter-encode the blocks and 1022 denotes movement 
compensating means for generating a movement compensating 
picture in accordance with the local decoded picture which 
can be calculated in accordance with the last-time encoding 
result, calculating the difference between the movement 
compensating picture and an input picture, and outputting 
the result in macroblocks. Movement compensation includes 
halfpixel prediction having a long processing time and 
fullpixel prediction having a short processing time. 
Symbol 1023 denotes orthogonal transforming means for 
applying DCT transformation to each macroblock and 1024 
denotes variable-length-encoding means for applying 
entropy encoding to the DCT transformation result and other 
encoded information. 

Symbol 2103 denotes counting means for counting 
execution frequencies of the four components of the video 
encoder 2102 and outputting the counting result to 
transforming means for every input picture. In this case, 
the execution frequency of the halfpixel prediction and 
that of the fullpixel prediction are counted separately for 
the movement compensating means 1022. 

Symbol 2104 denotes transforming means for outputting 
the data string shown in Figure 34. Symbol 2105 denotes 
transmitting means for multiplexing a variable-length code 
sent from the video encoder 2102 and a data string sent from 
the transforming means 2104 into a data string and 
outputting the data string to a data output terminal 2109. 

According to the above structure, it is possible to 
transmit the execution frequencies of indispensable 
processing (switching unit 1021, orthogonal transforming 
means 1023, and variable-length encoding means 1024) and 
dispensable processing (movement compensating means 1022) 
to a receiver. 

The transmitter of the first embodiment corresponds 
to claim 68. 

Figure 40 is a flowchart of the transmitting method 
of the second embodiment. 

Because operations of this embodiment are similar to 
those of the first embodiment, corresponding elements are 
noted in parentheses. A picture is input in step 801 
(picture input terminal 2101) and the picture is divided 
into macroblocks in step 802. Hereafter, processings from 
step 803 to step 806 are repeated until the processing 
corresponding to every macroblock is completed in 
accordance with the conditional branch in step 807. 
Moreover, each time one of these processings is executed, a 
corresponding variable is incremented by 1 so that the 
frequencies of the processings from step 803 to step 806 
are recorded in specific variables. 

First, it is decided whether to intra-encode or 
inter-encode a macroblock to be processed in step 803 
(switching unit 1021). When inter-encoding the macroblock, 
movement compensation is performed in step 804 (movement 
compensating means 1022). Thereafter, DCT transformation 
and variable-length encoding are performed in steps 805 and 
806 (orthogonal transforming means 1023 and 
variable-length encoding means 1024). When processing for 
every macroblock is completed (in the case of Yes in step 
807), the variable showing the execution frequency 
corresponding to each processing is read in step 808, the 
data string shown in Figure 2 is generated, and the data 
string and a code are multiplexed and output. The 
processings from step 801 to step 808 are repeatedly 
executed as long as input pictures continue. 

The above structure makes it possible to transmit the 
execution frequency of each processing. 
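
The counting performed in steps 803 to 808 can be sketched as 
follows. This is a simplified illustration: a macroblock is 
represented only by its encoding mode and prediction type, no 
actual encoding is performed, and the function and key names 
are hypothetical. 

```python
from collections import Counter
from typing import Iterable, Tuple

# (mode, prediction): mode is "intra" or "inter";
# prediction is "half" or "full" for inter macroblocks and "" for intra ones.
Macroblock = Tuple[str, str]


def count_execution_frequencies(macroblocks: Iterable[Macroblock]) -> Counter:
    """Count how often each processing is executed for one picture."""
    counts: Counter = Counter()
    for mode, prediction in macroblocks:
        counts["switch"] += 1                       # step 803: intra/inter decision
        if mode == "inter":                         # step 804: movement compensation
            key = "halfpixel" if prediction == "half" else "fullpixel"
            counts[key] += 1
        counts["dct"] += 1                          # step 805: DCT transformation
        counts["vlc"] += 1                          # step 806: variable-length encoding
    return counts                                   # read out in step 808


picture = [("intra", ""), ("inter", "half"), ("inter", "full"), ("inter", "half")]
frequencies = count_execution_frequencies(picture)
```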

The transmitting method of the second embodiment 
corresponds to claim 67. 

Figure 35 shows the structure of the receiver of the 
third embodiment. 

In Figure 35, symbol 307 denotes an input terminal for 
inputting the output of the transmitter of the first 
embodiment and 301 denotes receiving means for fetching a 
variable-length code and a data string through inverse 
multiplexing in accordance with the output of the 
transmitter of the first embodiment and outputting them. 
In this case, it is assumed that the time required to receive 
the data for one sheet is measured and also output. 

Symbol 303 denotes a decoder for a video using a 
variable-length code as an input, which is constituted with 
five components. Symbol 3031 denotes variable-length 
decoding means for fetching a DCT coefficient and other 
encoded information from a variable-length code, 3032 
denotes inverse orthogonal transforming means for applying 
inverse DCT transformation to a DCT coefficient, and 3033 
denotes a switching unit for switching its output to the 
upper side or the lower side for every macroblock in 
accordance with the encoded information showing whether the 
macroblock is intra-encoded or inter-encoded. Symbol 3034 
denotes movement compensating means for generating a 
movement compensating picture by using the last-time 
decoded picture and movement encoded information, adding 
the movement compensating picture to the output of the 
inverse orthogonal transforming means 3032, and outputting 
the result. Symbol 3035 denotes execution-time measuring 
means for measuring and outputting the execution time from 
when a variable-length code is input to the decoder 303 
until decoding and outputting of a picture is completed. 

Symbol 302 denotes estimating means for receiving the 
execution frequency of each element (variable-length 
decoding means 3031, inverse orthogonal transforming means 
3032, switching unit 3033, or movement compensating means 
3034) from a data string sent from the receiving means 301 
and execution time from the execution-time measuring means 
3035 to estimate the execution time of each element. 

To estimate the execution time of each element, it is 
possible to use linear regression, taking the measured 
execution time as an objective variable y and the 
execution frequency of each element as an explanatory 
variable x_i. In this case, a regression parameter a_i can 
be regarded as the execution time of each element. Linear 
regression, however, requires a sufficiently large amount 
of past data to be accumulated, and as a result much memory 
is consumed. To avoid this, it is also possible to use the 
estimation of an internal-state variable by a Kalman 
filter. This case can be regarded as one in which the 
observed value is the execution time, the execution time of 
each element is the internal-state variable, and the 
observation matrix C changes every step in accordance with 
the execution frequency of each element. Symbol 304 denotes 
frequency reducing means for changing the execution 
frequency of each element so as to reduce the execution 
frequency of halfpixel prediction and increase the 
execution frequency of fullpixel prediction by a 
corresponding value. The method for calculating the 
corresponding value is shown below. 

First, the execution frequency and estimated execution 
time of each element are received from the estimating means 
302 to estimate the execution time of decoding. When this 
execution time exceeds the time required to receive the 
data, which is obtained from the receiving means 301, the 
execution frequency of fullpixel prediction is increased 
and the execution frequency of halfpixel prediction is 
decreased until the former time does not exceed the latter 
time. Symbol 306 denotes an output terminal for a decoded 
picture. 
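
The calculation performed by the frequency reducing means 304 
can be sketched as follows. This is a minimal illustration, 
assuming that the estimated decoding time is the sum of 
frequency times execution time over the elements; the 
function name, the dictionary keys, and the numeric values 
are hypothetical. 

```python
from typing import Dict


def reduce_halfpixel(freq: Dict[str, int],
                     time_per_exec: Dict[str, float],
                     receive_time: float) -> Dict[str, int]:
    """Move executions from halfpixel to fullpixel prediction until the
    estimated decoding time no longer exceeds the time needed to receive
    the data. Both dictionaries must contain "halfpixel" and "fullpixel"."""
    adjusted = dict(freq)

    def estimate() -> float:
        return sum(adjusted[name] * time_per_exec[name] for name in adjusted)

    while estimate() > receive_time and adjusted["halfpixel"] > 0:
        adjusted["halfpixel"] -= 1
        adjusted["fullpixel"] += 1
    return adjusted


freq = {"idct": 10, "halfpixel": 10, "fullpixel": 0}
times = {"idct": 2.0, "halfpixel": 3.0, "fullpixel": 1.0}  # arbitrary time units
adjusted = reduce_halfpixel(freq, times, receive_time=40.0)
```

In this example the initial estimate is 50 time units; each 
replacement saves 2 units, so five halfpixel predictions are 
replaced before the estimate reaches the limit of 40. 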

Moreover, there is a case in which the movement 
compensating means 3034 is designated so as to perform 
halfpixel prediction in accordance with encoded 
information. In this case, when the predetermined 
execution frequency of halfpixel prediction is exceeded, 
a halfpixel movement is rounded to a fullpixel movement to 
execute fullpixel prediction. 
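
The rounding of a halfpixel movement to a fullpixel movement 
can be sketched as follows. The sketch assumes that movement 
vectors are stored as integers in half-pixel units (an odd 
component means a halfpixel position) and that rounding is 
toward zero; neither detail is specified in the text, and the 
function names are hypothetical. 

```python
from typing import List, Tuple

Vector = Tuple[int, int]  # (dx, dy) in half-pixel units


def is_halfpixel(mv: Vector) -> bool:
    """A vector needs halfpixel prediction if either component is odd."""
    return mv[0] % 2 != 0 or mv[1] % 2 != 0


def round_to_fullpixel(mv: Vector) -> Vector:
    """Round each component toward zero to an even (full-pixel) value."""
    return (int(mv[0] / 2) * 2, int(mv[1] / 2) * 2)


def apply_halfpixel_budget(vectors: List[Vector], max_halfpixel: int) -> List[Vector]:
    """Allow halfpixel prediction up to the predetermined execution frequency,
    then round every further halfpixel vector to a fullpixel vector."""
    used = 0
    result: List[Vector] = []
    for mv in vectors:
        if is_halfpixel(mv):
            if used < max_halfpixel:
                used += 1
            else:
                mv = round_to_fullpixel(mv)
        result.append(mv)
    return result


rounded = apply_halfpixel_budget([(1, 0), (2, 4), (3, -3), (-1, 2)], max_halfpixel=1)
```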

According to the above-described first and third 
embodiments, the execution time of decoding is estimated 
in accordance with the estimated execution time of each 
element and, when the decoding execution time may exceed 
the time (designated time) required to receive the data for 
one sheet, halfpixel prediction having a long execution 
time is replaced with fullpixel prediction. Thereby, it 
is possible to prevent an execution time from exceeding a 
designated time and solve the problem (C1) (corresponding 
to claims 68 and 74). 

Moreover, a case of regarding the parts of 
indispensable and dispensable processings as two groups 
corresponds to claims 66 and 72 and a case of regarding the 
part of a video as waveform data corresponds to claims 64 
and 70. 

Furthermore, by using no high-frequency components in 
the IDCT calculation by a receiver, it is possible to reduce 
the processing time for the IDCT calculation. That is, by 
regarding the calculation of low-frequency components as 
indispensable processing and the calculation of high- 
frequency components as dispensable processing in the IDCT 
calculation, it is also possible to reduce the calculation 
frequency of high-frequency components in the IDCT 
calculation. 
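
The idea of omitting high-frequency components from the IDCT 
calculation can be sketched with a one-dimensional inverse 
DCT. This is only an illustration: an actual decoder applies 
a two-dimensional 8x8 IDCT, and the function name and the 
"keep" parameter are assumptions. 

```python
import math
from typing import List


def idct_1d(coeffs: List[float], keep: int) -> List[float]:
    """Orthonormal 1-D inverse DCT that uses only the first `keep`
    (lowest-frequency) coefficients and skips the rest."""
    n = len(coeffs)
    used = min(keep, n)
    samples = []
    for i in range(n):
        total = 0.0
        for k in range(used):  # high-frequency terms k >= keep are never computed
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(total)
    return samples


dc_only = idct_1d([8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], keep=1)
low = idct_1d([4.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], keep=2)
full = idct_1d([4.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], keep=8)
```

The inner loop runs `keep` times per sample instead of n 
times, so the amount of calculation falls in proportion to 
the number of omitted coefficients. When the omitted 
coefficients are zero, the result is unchanged. 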

Figure 41 is a flowchart of the receiving method of 
the fourth embodiment. 

Because operations of this embodiment are similar to 
those of the third embodiment, corresponding elements are 
noted in parentheses. In step 901, the variable a_i for 
expressing the execution time of each element is 
initialized (estimating means 302). In step 902, 
multiplexed data is input and the time required to receive 
the data is measured (receiving means 301). In step 903, 
the multiplexed data is divided into a variable-length code 
and a data string and output (receiving means 301). In step 
904, each execution frequency is fetched from the data 
string (Figure 2) and it is set to x_i. In step 905, an 
actual execution frequency is calculated in accordance with 
the execution time a_i of each element and each execution 
frequency x_i (frequency reducing means 304). In step 906, 
measurement of the execution time for decoding is started. 
In step 907, a decoding routine to be described later is 
started. Thereafter, in step 908, measurement of the 
decoding execution time is ended (video decoder 303 and 
execution-time measuring means 3035). In step 909, the 
execution time of each element is estimated in accordance 
with the decoding execution time obtained in step 908 and 
the actual execution frequency of each element obtained in 
step 905 to update a_i (estimating means 302). The above 
processing is executed for every input of multiplexed data. 
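
The update of a_i can be sketched with a Kalman filter whose 
internal state is the vector of per-element execution times 
and whose observation row is the vector of execution 
frequencies x_i for the current picture. This is a minimal 
sketch: the class name, the noise parameters, and the initial 
variance are assumptions, and the state is assumed to be 
constant from step to step. 

```python
from typing import List


class ExecutionTimeEstimator:
    """Kalman filter for y = sum(x_i * a_i) + noise with a static state a."""

    def __init__(self, n: int, initial_variance: float = 1e3,
                 process_noise: float = 0.0, measurement_noise: float = 1e-6) -> None:
        self.a = [0.0] * n  # estimated execution time of each element
        self.p = [[initial_variance if i == j else 0.0 for j in range(n)]
                  for i in range(n)]  # error covariance
        self.q = process_noise
        self.r = measurement_noise

    def update(self, x: List[float], y: float) -> List[float]:
        """x: execution frequencies for one picture, y: measured decoding time."""
        n = len(self.a)
        for i in range(n):
            self.p[i][i] += self.q                        # prediction step
        px = [sum(self.p[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(x[i] * px[i] for i in range(n)) + self.r  # innovation variance
        gain = [v / s for v in px]                        # Kalman gain
        residual = y - sum(x[i] * self.a[i] for i in range(n))
        self.a = [self.a[i] + gain[i] * residual for i in range(n)]
        # P <- P - K (x^T P); P is symmetric, so x^T P equals px transposed.
        self.p = [[self.p[i][j] - gain[i] * px[j] for j in range(n)]
                  for i in range(n)]
        return self.a


estimator = ExecutionTimeEstimator(2)
for x, y in [([1, 0], 2.0), ([0, 1], 5.0), ([3, 2], 16.0), ([1, 1], 7.0)]:
    estimator.update(x, y)
```

Only the current estimate and an n-by-n covariance matrix are 
stored, so no history of past pictures has to be kept, which 
is the memory advantage over batch linear regression. 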

Moreover, in the decoding routine of step 907, 
variable-length decoding is performed in step 910 
(variable-length decoding means 3031), inverse orthogonal 
transformation is performed in step 911 (inverse orthogonal 
transforming means 3032), and processing is branched in 
step 912 in accordance with the information of the 
intra-/inter-processing fetched through the processing in 
step 910 (switching unit 3033). In the case of 
inter-processing, movement compensation is performed in 
step 913 (movement compensating means 3034). In step 913, 
the execution frequency of halfpixel prediction is also 
counted. When the counted execution frequency exceeds the 
actual execution frequency obtained in step 905, halfpixel 
prediction is replaced with fullpixel prediction for 
execution. After the above processing is applied to every 
macroblock (step 914), the routine is ended. 

According to the above-described second and fourth 
embodiments, the execution time of decoding is estimated 
in accordance with the estimated execution time of each 
element and, when the execution time may exceed the time 
required to receive the data for one sheet (designated 
time), halfpixel prediction having a long execution time is 
replaced with fullpixel prediction. Thereby, it is 
possible to prevent an execution time from exceeding a 
designated time and solve the problem (C1) (corresponding 
to claims 67 and 73). 

Furthermore, a case of regarding the parts of 
dispensable and indispensable processings as two groups 
corresponds to claims 65 and 71 and a case of regarding the 
part of a video as waveform data corresponds to claims 63 
and 69. 

Figure 36 shows the structure of the receiver of the 
fifth embodiment. 

Most components of this embodiment are the same as 
those described for the second embodiment. However, two 
added components and one corrected component are described 
below. 

Symbol 402 denotes estimating means obtained by 
correcting the estimating means 302 described for the 
second embodiment so as to output the execution time of each 
element obtained as the result of estimation separately 
from an output to the frequency reducing means 304. Symbol 
408 denotes transmitting means for generating the data 
string shown in Figure 37 in accordance with the execution 
time of each element and outputting it. When an execution 
time is expressed with 16 bits by using the microsecond as 
the unit, up to approx. 65 msec can be expressed, which 
will be enough. Symbol 409 denotes an output terminal for 
transmitting the data string to transmitting means. 
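
The 16-bit representation can be sketched as follows. The 
layout of the data string in Figure 37 is not reproduced 
here, so the big-endian byte order, the saturation of values 
above the maximum, and the function names are assumptions. 

```python
import struct
from typing import List

MAX_MICROSECONDS = 0xFFFF  # 65535 microseconds, i.e. approx. 65 msec


def pack_execution_times(times_us: List[int]) -> bytes:
    """Encode each execution time as an unsigned 16-bit value in microseconds."""
    clipped = [min(max(int(t), 0), MAX_MICROSECONDS) for t in times_us]
    return struct.pack(">%dH" % len(clipped), *clipped)


def unpack_execution_times(data: bytes) -> List[int]:
    """Decode a data string produced by pack_execution_times."""
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


encoded = pack_execution_times([120, 70000, 3])
decoded = unpack_execution_times(encoded)
```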

Moreover, a receiving method corresponding to the 
fifth embodiment can be obtained simply by adding a step 
for generating the data string shown in Figure 37 
immediately after step 808 in Figure 40. 

Figure 38 shows the structure of the transmitter of 
the sixth embodiment. 

Most components of this embodiment are the same as 
those described for the first embodiment. However, two 
added components are described below. Symbol 606 denotes 
an input terminal for receiving a data string output by the 
receiver of the third embodiment and 607 denotes receiving 
means for receiving the data string and outputting the 
execution time of each element. Symbol 608 denotes 
deciding means for obtaining the execution frequency of 
each element and its obtaining procedure is described below. 
First, every macroblock in a picture is processed by the 
switching unit 1021 to obtain the execution frequency of 
the switching unit 1021 at this point of time. Moreover, 
the execution frequencies of the movement compensating 
means 1022, orthogonal transforming means 1023, and 
variable-length encoding means 1024 can be uniquely decided 
in accordance with the processing result up to this point 
of time. Therefore, the execution time required for 
decoding at the receiver side is estimated by using these 
execution frequencies and the execution times sent from the 
receiving means 607. The estimated decoding time is 
obtained as the total sum, over every element, of the 
product between the execution time and the execution 
frequency of the element. Moreover, when the estimated 
decoding time is equal to or more than the time required to 
transmit the amount of code (e.g. 16 Kbits) to be generated 
for this picture as designated by a rate controller or the 
like (e.g. 250 msec when the transmission rate is 64 
Kbits/sec), the execution frequency of fullpixel prediction 
is increased and the execution frequency of halfpixel 
prediction is decreased so that the estimated decoding 
execution time does not exceed the time required for 
transmission. (Because fullpixel prediction has a shorter 
execution time, it is possible to reduce the total 
execution time by reducing the frequency of halfpixel 
prediction.) 
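
The decision made by the deciding means 608 can be sketched 
as follows. The 16 Kbits and 64 Kbits/sec figures are taken 
from the text; the per-element execution times, the 
frequencies, the function names, and the closed-form 
calculation of the permitted halfpixel count are assumptions 
for illustration. 

```python
from typing import Dict


def transmission_time_us(code_bits: int, rate_bits_per_sec: int) -> int:
    """Time required to transmit the code for one picture, in microseconds."""
    return code_bits * 1_000_000 // rate_bits_per_sec


def estimated_decoding_time_us(freq: Dict[str, int], time_us: Dict[str, int]) -> int:
    """Total sum of (execution time x execution frequency) over every element."""
    return sum(freq[name] * time_us[name] for name in freq)


def allowed_halfpixel(freq: Dict[str, int], time_us: Dict[str, int], budget_us: int) -> int:
    """Largest halfpixel count that keeps the estimate within the budget when
    the remaining movement compensations use fullpixel prediction."""
    half, full = freq["halfpixel"], freq["fullpixel"]
    extra = time_us["halfpixel"] - time_us["fullpixel"]
    if extra <= 0:
        return half  # halfpixel is not slower, so nothing needs to be replaced
    all_full = dict(freq, halfpixel=0, fullpixel=half + full)
    spare = budget_us - estimated_decoding_time_us(all_full, time_us)
    return max(0, min(half, spare // extra))


budget = transmission_time_us(16_000, 64_000)  # 250 msec
freq = {"vld": 99, "idct": 99, "halfpixel": 60, "fullpixel": 20}
time_us = {"vld": 500, "idct": 1000, "halfpixel": 2000, "fullpixel": 1000}
limit = allowed_halfpixel(freq, time_us, budget)
```

With these assumed values the unmodified estimate is 288.5 
msec, which exceeds the 250 msec budget, and limiting 
halfpixel prediction to 21 executions brings the estimate 
down to 249.5 msec. 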

Moreover, the video encoder 2102 performs various 
processings in accordance with the execution frequency 
designated by the deciding means 608. For example, after 
the movement compensating means 1022 has executed halfpixel 
prediction the predetermined number of times, it executes 
only fullpixel prediction. 

Furthermore, it is possible to improve the selecting 
method so that halfpixel prediction is uniformly dispersed 
in a picture. For example, it is possible to use a method 
of first obtaining every macroblock requiring halfpixel 
prediction, calculating the quotient (3) obtained by 
dividing the number of the above macroblocks (e.g. 12) by 
the execution frequency of halfpixel prediction (e.g. 4), 
and applying halfpixel prediction only to a macroblock 
whose sequence number from the beginning of the macroblocks 
requiring halfpixel prediction is divided by the above 
quotient without a remainder (0, 3, 6, or 9). 
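
The selection rule in this example can be sketched as 
follows. The function name is hypothetical, and the final 
truncation to the permitted count is an added assumption for 
the case in which the division leaves a remainder. 

```python
from typing import List


def halfpixel_positions(num_candidates: int, halfpixel_count: int) -> List[int]:
    """Sequence numbers (among macroblocks requiring halfpixel prediction)
    to which halfpixel prediction is actually applied."""
    if halfpixel_count <= 0:
        return []
    step = max(1, num_candidates // halfpixel_count)  # 12 // 4 == 3 in the text
    selected = [i for i in range(num_candidates) if i % step == 0]
    return selected[:halfpixel_count]  # assumed cap when the division is inexact


positions = halfpixel_positions(12, 4)
```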

According to the above-described fifth and sixth 
embodiments, the estimated execution time of each element 
is transmitted to the transmitting side, the execution time 
of decoding is estimated at the transmitting side, and 
halfpixel prediction having a long execution time is 
replaced with fullpixel prediction so that the estimated 
decoding execution time does not exceed the time 
(designated time) probably required to receive the data for 
one sheet. Thereby, the information for halfpixel 
prediction among the sent encoded information is not 
disused, and it is possible to prevent an execution time 
from exceeding a designated time and solve the problem 
(C2) (corresponding to claims 76 and 78). 

Moreover, in the case of dispensable processing, it 
is possible to divide inter-macroblock encoding into such 
three movement compensations as normal movement 
compensation, 8x8 movement compensation, and overlap 
movement compensation. 

Figure 42 is a flowchart of the transmitting method 
of the seventh embodiment. 

Because operations of this embodiment are similar to 
those of the sixth embodiment, corresponding elements are 
noted in parentheses. In step 1001, the initial value of 
the execution time of each processing is set. A picture is 
input (input terminal 2101) in step 801 and it is divided 
into macroblocks in step 802. In step 1002, it is decided 
whether to intra-encode or inter-encode every macroblock 
(switching unit 1021). As a result, the execution frequency 
of each processing from step 1005 to step 806 is known. 
Therefore, in step 1003, an actual execution frequency is 
calculated in accordance with the above execution frequency 
and the execution time of each processing (deciding means 
608). 

Hereafter, the processings from step 1005 to step 806 
are repeated until the processing for every macroblock is 
completed in accordance with the conditional branch in step 
807. 

Moreover, when each processing is executed, a 
corresponding variable is incremented by 1 so that the 
processing frequencies from step 1005 to step 806 can be 
recorded in a specific variable. First, in step 1005, 
branching is performed in accordance with the decision 
result in step 1002 (switching unit 1021). In the case of 
inter-encoding, movement compensation is performed in step 
804 (movement compensating means 1022). In this case, the 
frequency of halfpixel prediction is counted. When the 
counted frequency exceeds the actual frequency obtained in 
step 1003, fullpixel prediction is executed instead of 
halfpixel prediction. Thereafter, in steps 805 and 806, DCT 
transformation and variable-length encoding are performed 
(orthogonal transforming means 1023 and variable-length 
encoding means 1024). When the processing for every 
macroblock is completed (in the case of Yes in step 807), 
the variable showing the execution frequency corresponding 
to each processing is read in step 808, the data string 
shown in Figure 2 is generated, and the data string and a 
code are multiplexed and output. In step 1004, the data 
string is received and the execution time of each 
processing is fetched from the data string and set. 

Processings from step 801 to step 1004 are repeatedly 
executed as long as pictures are input. 

According to the paragraph beginning with the final 
"Moreover" of the descriptive portion of the fifth 
embodiment and the seventh embodiment, the estimated 
execution time of each element is transmitted to the 
transmitting side, the execution time of decoding is 
estimated at the transmitting side, and halfpixel 
prediction having a long execution time is replaced with 
fullpixel prediction so that the estimated decoding 
execution time does not exceed the time (designated time) 
probably required to receive the data for one sheet. 
Thereby, the information for halfpixel prediction among the 
sent encoded information is not disused and it is possible 
to prevent the execution time from exceeding the designated 
time and solve the problem (C2) (corresponding to claims 
75 and 77). 

Figure 39 shows the structure of the transmitting 
apparatus of the eighth embodiment of the present 
invention. 

Most components of this embodiment are the same as 
those described for the first embodiment. Therefore, four 
added components are described below. 

Symbol 7010 denotes execution-time measuring means for 
measuring the execution time from when a picture is input 
to the encoder 2102 until encoding and outputting of the 
picture are completed and outputting the measured execution 
time. Symbol 706 denotes estimating means for receiving the 
execution frequencies of the elements (switching unit 1021, 
movement compensating means 1022, orthogonal transforming 
means 1023, and variable-length encoding means 1024) as a 
data string from the counting means 2103 and the execution 
time from the execution-time measuring means 7010 and 
estimating the execution time of each element. It is 
possible to use the same estimating method as that 
described for the estimating means 302 of the second 
embodiment. Symbol 707 denotes an input terminal for 
inputting a frame rate value sent from a user and 708 
denotes deciding means for obtaining the execution 
frequency of each element. The obtaining procedure is 
described below. 

First, every macroblock in a picture is processed by 
the switching unit 1021 to obtain the execution frequency 
of the switching unit 1021 at this point of time. 
Thereafter, the execution frequencies of the movement 
compensating means 1022, orthogonal transforming means 
1023, and variable-length encoding means 1024 can be 
uniquely decided in accordance with the processing result 
up to this point of time. Then, the products of the 
execution frequency and the estimated execution time of 
each element sent from the estimating means 706 are summed 
over all elements to calculate an estimated encoding time. 
Then, when the estimated encoding time is equal to or 
longer than the time usable for encoding one sheet of 
picture, obtained as the reciprocal of the frame rate sent 
from the input terminal 707, the execution frequency of 
fullpixel prediction is increased and that of halfpixel 
prediction is decreased. 

By repeating the above change of execution frequencies 
and calculation of the estimated encoding time until the 
estimated encoding time becomes equal to or shorter than 
the usable time, each execution frequency is decided. 
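
The deciding procedure above can be sketched as follows. This 
is a minimal illustration only: the function name, the 
dictionary keys, and the choice to move one macroblock at a 
time from halfpixel to fullpixel prediction are assumptions 
made for the example and are not taken from the embodiment. 

```python
def decide_frequencies(freq, est_time, frame_rate):
    """Shift halfpixel predictions to fullpixel until the estimated
    encoding time fits in the time usable for one picture.

    freq      -- execution frequency of each element (hypothetical keys)
    est_time  -- estimated execution time of one run of each element, seconds
    frame_rate -- frame rate supplied by the user, pictures per second
    """
    freq = dict(freq)              # do not modify the caller's counts
    usable = 1.0 / frame_rate      # reciprocal of the frame rate

    def estimated_encoding_time():
        # Sum over all elements of (execution frequency x estimated time).
        return sum(freq[name] * est_time[name] for name in freq)

    # Assumed policy: move one prediction at a time from halfpixel to
    # fullpixel and re-evaluate, stopping when the estimate fits.
    while estimated_encoding_time() > usable and freq["halfpixel"] > 0:
        freq["halfpixel"] -= 1
        freq["fullpixel"] += 1
    return freq, estimated_encoding_time()
```

For example, with six halfpixel predictions costing 4 ms each, 
fullpixel predictions costing 1 ms each, and 16 ms of other 
work, a frame rate of 30 leaves three halfpixel predictions. 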

Moreover, the video encoder 2102 performs various 
processings in accordance with the execution frequency 
designated by the deciding means 708. For example, after 
the movement compensating means 1022 executes halfpixel 
prediction the number of times given by the decided 
execution frequency of halfpixel prediction, it executes 
only fullpixel prediction. 

Furthermore, it is also possible to improve the selecting 
method so that halfpixel prediction is uniformly dispersed 
in a picture. For example, it is possible to use a method 
of finding every macroblock requiring halfpixel 
prediction, calculating the quotient (3) obtained by 
dividing the number of macroblocks requiring halfpixel 
prediction (e.g. 12) by the execution frequency of 
halfpixel prediction (e.g. 4), and applying halfpixel 
prediction only to a macroblock whose sequence number, 
counted from the beginning of the macroblocks requiring 
halfpixel prediction, is divisible by the quotient without 
remainder (0, 3, 6, or 9). 
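
The selection rule in this example can be written as a short 
sketch. The function name and the truncation of the result 
when the division leaves a remainder are assumptions added 
for the illustration. 

```python
def select_halfpixel_macroblocks(candidates, halfpixel_frequency):
    """Pick macroblocks for halfpixel prediction, spread evenly.

    candidates          -- macroblocks requiring halfpixel prediction, in order
    halfpixel_frequency -- how many of them may actually use halfpixel
    """
    if halfpixel_frequency <= 0:
        return []
    # Quotient of the candidate count by the allowed frequency (12 // 4 = 3).
    step = max(1, len(candidates) // halfpixel_frequency)
    # Keep candidates whose sequence number is divisible by the quotient.
    chosen = [mb for i, mb in enumerate(candidates) if i % step == 0]
    # Assumption: drop any surplus when the division is not exact.
    return chosen[:halfpixel_frequency]
```

With 12 candidates and an execution frequency of 4, the 
candidates at sequence numbers 0, 3, 6, and 9 are selected. 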

The above eighth embodiment makes it possible to solve 
the problem (C3) by estimating the execution time of each 
processing, estimating an execution time required for 
encoding in accordance with the estimated execution time, 
and deciding an execution frequency so that the estimated 
encoding time becomes equal to or shorter than the time 
usable for encoding of a picture determined in accordance 
with a frame rate (corresponding to claim 80). 

Moreover, as methods by which the movement compensating 
means 1022 detects a movement vector, there is a 
full-search movement-vector detecting method for detecting 
the vector that minimizes SAD (the sum of the absolute 
values of the differences for every pixel) among vectors in 
a range of 15 horizontal and vertical pixels. Furthermore, 
there is a three-step movement-vector detecting method 
(described in the annex of H.261). The three-step 
movement-vector detecting method executes the processing of 
selecting nine points uniformly arranged in the above 
retrieval range to select the point having the minimum SAD 
and then selecting nine points again in a narrower range 
close to that point to select the point having the minimum 
SAD one more time. 
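
A minimal sketch of a three-step search is given below for 
illustration. The block size, the step sizes (4, 2, 1), and 
the function names are assumptions and are not taken from the 
embodiment. Because only nine points are examined per stage, 
the result is not guaranteed to be the vector a full search 
would find. 

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def in_bounds(ref, bx, by, dx, dy, size):
    """True if the displaced block lies entirely inside `ref`."""
    h, w = len(ref), len(ref[0])
    return (0 <= bx + dx and bx + dx + size <= w
            and 0 <= by + dy and by + dy + size <= h)


def three_step_search(cur, ref, bx, by, size=4, steps=(4, 2, 1)):
    """Return (vector, SAD) found by a coarse-to-fine nine-point search.
    The undisplaced block at (bx, by) is assumed to lie inside `ref`."""
    best = (0, 0)
    best_sad = sad(cur, ref, bx, by, 0, 0, size)
    for step in steps:
        cx, cy = best                      # centre of this stage
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cand = (cx + dx, cy + dy)
                if not in_bounds(ref, bx, by, cand[0], cand[1], size):
                    continue
                s = sad(cur, ref, bx, by, cand[0], cand[1], size)
                if s < best_sad:           # keep the minimum-SAD point
                    best, best_sad = cand, s
    return best, best_sad
```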

It is also possible to properly decrease the execution 
frequency of the full-search movement-vector detecting 
method and properly increase the execution frequency of the 
three-step movement-vector detecting method by regarding 
these two methods as dispensable processing methods, 
estimating the execution time of each of the two methods, 
and estimating the execution time required for encoding in 
accordance with the estimated execution times, so that the 
estimated encoding time becomes equal to or shorter than 
the time designated by a user. 

Moreover, it is possible to use a movement-vector 
detecting method using a fixed retrieval frequency and 
further simplifying the processing or a movement-vector 
detecting method of returning only the movement vector (0, 
0) as a result together with the three-step movement-vector 
detecting method. 

Figure 43 is a flowchart of the transmitting method 
of the ninth embodiment. 

Because operations of this embodiment are similar to 
those of the eighth embodiment, the corresponding elements 
are noted in parentheses. For the detailed operation in 
each step, refer to the description of the corresponding 
elements. 

Moreover, because this embodiment is almost the same 
as the second embodiment, only different points are 
explained below. 

In step 1101, the initial value of the execution time 
of each processing is set to a variable a_i. Moreover, in 
step 1102, a frame rate is input (input terminal 707). In 
step 1103, an actual execution frequency is decided in 
accordance with the frame rate input in step 1102, the 
execution time a_i of each processing, and the execution 
frequency of each processing obtained from the 
intra-/inter-processing decision result in step 1002 
(deciding means 708). In steps 1105 and 1106, the execution 
time of encoding is measured. In step 1104, the execution 
time of each processing is estimated in accordance with the 
execution time obtained in step 1106 and the actual 
execution frequency of each processing to update the 
variable a_i (estimating means 706). 
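
The update of the variables a_i in step 1104 can be pictured 
with the sketch below. The estimating method itself is the 
one described for the second embodiment and is not reproduced 
here; the normalized least-mean-squares update used in this 
sketch is a stand-in chosen for illustration, and the function 
name and the gain parameter are assumptions. 

```python
def update_estimates(a, freq, measured_time, gain=0.5):
    """Adjust per-element time estimates a_i toward a measured total.

    a             -- current estimated execution time of each processing
    freq          -- actual execution frequency of each processing
    measured_time -- measured encoding time for the picture
    gain          -- fraction of the prediction error corrected per update
    """
    predicted = sum(a_i * n_i for a_i, n_i in zip(a, freq))
    error = measured_time - predicted
    norm = sum(n_i * n_i for n_i in freq)
    if norm == 0:
        return list(a)                 # nothing was executed; keep estimates
    # Spread the error over the elements in proportion to how often
    # each one ran (assumed update rule, not the patent's method).
    return [a_i + gain * error * n_i / norm for a_i, n_i in zip(a, freq)]
```

With a gain of 1.0, the updated estimates reproduce the 
measured time exactly for the picture just encoded; a smaller 
gain smooths the estimates over several pictures. 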

According to the above-described ninth embodiment, the 
execution time of each processing is estimated and the 
execution time required for encoding is estimated in 
advance in accordance with the estimated execution time. 
Thus, it is possible to solve the problem (C3) by deciding 
an actual execution frequency so that the estimated 
encoding time becomes equal to or shorter than the time 
usable for encoding a picture, which is determined in 
accordance with a frame rate (corresponding to claim 79). 

In the case of the second embodiment, it is also 
possible to add a two-byte region immediately after the 
start code shown in Figure 2 when the data string is 
generated in step 808 and add the binary notation of a code 
length to the region. 

Moreover, in the case of the fourth embodiment, it is 
also possible to extract a code length from the two-byte 
region when multiplexed data is input in step 902 and use 
the code transmission time obtained from the code length 
and the code transmission rate for the execution frequency 
calculation in step 905 (the execution frequency of 
halfpixel prediction is decreased so as not to exceed the 
code transmission time). This corresponds to claims 81 and 
83. 
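
The use of the two-byte code-length region can be sketched as 
follows. The byte order (big-endian), the unit of the code 
length (bytes), the length of the start code, and all function 
names are assumptions made for the illustration; the text 
specifies only that the region holds the binary notation of 
the code length. 

```python
import struct


def read_code_length(data, start_code_len):
    """Read the two-byte code length placed right after the start code.
    Big-endian byte order is an assumption."""
    (length,) = struct.unpack_from(">H", data, start_code_len)
    return length


def transmission_time(code_length_bytes, rate_bits_per_s):
    """Time needed to receive the code at the given transmission rate."""
    return code_length_bytes * 8 / rate_bits_per_s


def limit_halfpixel(n_half, t_half, t_full, other_time, budget):
    """Largest halfpixel count (<= n_half) whose estimated decoding
    time does not exceed `budget`; the rest use fullpixel prediction."""
    total = n_half
    while n_half > 0:
        estimate = other_time + n_half * t_half + (total - n_half) * t_full
        if estimate <= budget:
            break
        n_half -= 1
    return n_half
```

For instance, a 1500-byte code at 64000 bits per second takes 
0.1875 seconds to receive, and that value is then used as the 
budget when reducing the halfpixel execution frequency. 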

Furthermore, in the case of the first embodiment, it 
is also possible to add a two-byte region immediately after 
the start code shown in Figure 2 when a data string is 
generated in step 2104 and add the binary notation of a code 
length to the region. 

Furthermore, in the case of the third embodiment, it 
is also possible to extract a code length from the 
two-byte region when multiplexed data is input in step 301 
and use a code transmission time obtained from the code 
length and the code transmission rate for the execution 
frequency calculation in step 304 (the execution frequency 
of halfpixel prediction is decreased so as not to exceed 
the code transmission time). This corresponds to claims 82 
and 84. 

Furthermore, in the case of the fourth embodiment, the 
actual execution frequency of halfpixel prediction is 
recorded immediately after step 909 to calculate a maximum 
value. When the maximum value is equal to or less than a 
small-enough value (e.g. 2 or 3), it is also possible to 
generate a data string (a data string comprising a specific 
bit pattern) showing that halfpixel prediction is not used 
and transmit the generated data string. Furthermore, in 
the case of the second embodiment, it is confirmed 
immediately after step 808 whether the data string is 
received, and when the data string showing that halfpixel 
prediction is not used is received, it is also possible to 
make the movement compensation processing always serve as 
fullpixel prediction in step 808. This corresponds to 
claims 93 and 91. 

Furthermore, the above concept can be applied to cases 
other than movement compensation. For example, it is 
possible to reduce the DCT calculation time by using no 
high-frequency component for DCT calculation. That is, in 
the case of a receiving method, when the ratio of the 
IDCT-calculation execution time to the entire execution 
time exceeds a certain value, a data string showing that 
the ratio exceeds the certain value is transmitted to the 
transmitting side. When the transmitting side receives the 
data string, it is also possible to calculate only 
low-frequency components through the DCT calculation and 
set all high-frequency components to zero. This 
corresponds to claim 89. 
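
Calculating only the low-frequency components can be sketched 
as below. The sketch uses an orthonormal two-dimensional 
DCT-II and treats a coefficient as low-frequency when both of 
its indices are below a threshold `keep`; this definition of 
"low-frequency", the function name, and the parameter are 
assumptions made for the illustration. 

```python
import math


def dct2_low_frequency(block, keep):
    """2-D DCT of a square block, computing only coefficients whose
    horizontal and vertical frequency indices are both below `keep`.
    All other coefficients are left at zero without being computed."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    limit = min(keep, n)
    for v in range(limit):
        cv = math.sqrt((1 if v == 0 else 2) / n)
        for u in range(limit):
            cu = math.sqrt((1 if u == 0 else 2) / n)
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[v][u] = cu * cv * s
    return out
```

The calculation cost falls roughly with the square of `keep`, 
and the zeroed coefficients also shorten the inverse 
calculation on the receiving side if it skips zero terms. 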

Furthermore, though the embodiment is described above 
by using a picture, it is possible to apply each of the above 
methods to audio instead of video. This corresponds to 
claims 85 and 87. 

Furthermore, in the case of the third embodiment, the 
actual execution frequency of halfpixel prediction is 
recorded in step 3034 to calculate a maximum execution 
frequency. Then, when the maximum value is a small-enough 
value or less (e.g. 2 or 3), it is possible to generate and 
transmit a data string showing that halfpixel prediction 
is not used (a data string comprising a specific bit 
pattern). Furthermore, in the case of the first embodiment, 
when a data string showing that halfpixel prediction is not 
used is received, it is possible to make the movement 
compensation processing in step 1022 always serve as 
fullpixel prediction. This corresponds to claims 94 and 92. 

Furthermore, the above concept can be applied to cases 
other than movement compensation. For example, by using 
no high-frequency component for DCT calculation, it is 
possible to reduce the DCT calculation processing time. 
That is, in the case of a receiving method, when the ratio 
of the IDCT-calculation execution time to the entire 
execution time exceeds a certain value, a data string 
showing that the ratio exceeds the certain value is 
transmitted to the transmitting side. 

When the transmitting side receives the data string, 
it is possible to calculate only low-frequency components 
through the DCT calculation and set all high-frequency 
components to zero. This corresponds to claim 90. 

Furthermore, though the embodiment is described above 
by using a picture, it is also possible to apply the above 
method to audio instead of picture. This corresponds to 
claims 86 and 88. 

As described above, according to claims 68 and 74 (e.g. 
first and third embodiments), the execution time of 
decoding is estimated in accordance with the estimated 
execution time of each element and, when the estimated 
decoding execution time may exceed the time (designated 
time) required to receive the data for one sheet, halfpixel 
prediction having a long execution time is replaced with 
fullpixel prediction. Thereby, it is possible to prevent 
the execution time from exceeding the designated time and 
solve the problem (C1). 

Furthermore, according to claims 75 and 77 (e.g. fifth 
and seventh embodiments), the estimated execution time of 
each element is transmitted to the transmitting side, the 
execution time of decoding is estimated at the transmitting 
side, and halfpixel prediction having a long execution time 
is replaced with fullpixel prediction so that the estimated 
decoding time does not exceed the time (designated time) 
expected to be required to receive the data for one sheet. 
Thereby, the information for halfpixel prediction in the 
sent encoded information is not wasted, and it is possible 
to prevent the execution time from exceeding the designated 
time and solve the problem (C2). 

Furthermore, according to claim 79 (e.g. ninth 
embodiment), it is possible to solve the problem (C3) by 
estimating the execution time of each processing, further 
estimating the execution time required for encoding in 
accordance with the estimated execution time, and deciding 
an execution frequency so that the estimated encoding time 
becomes equal to or less than the time usable for encoding 
of a picture decided in accordance with a frame rate. 

Thus, the present invention makes it possible to 
realize a function (CGD: Computational Graceful 
Degradation) for degrading quality gradually even if the 
calculation load increases, and thereby a very large 
advantage can be obtained. 

Moreover, it is possible to perform the same operations 
as described above on a computer by using a recording 
medium, such as a magnetic recording medium or an optical 
recording medium, on which is recorded a program for making 
the computer execute all or part of the steps (or the 
operations of all or part of the means) described in any 
one of the above-described embodiments. 

Industrial Applicability 

As described above, the present invention makes it 
possible to change information frames correspondingly to 
the situation, purpose, or transmission line by dynamically 
deciding the frames of data control information, 
transmission control information, and control information 
used for transmitting and receiving terminals. Moreover, 
it is easy to handle a plurality of video streams or a 
plurality of audio streams and to mainly reproduce an 
important scene cut synchronously with audio by reflecting 
the intention of an editor. Furthermore, it is possible 
to prevent an execution time from exceeding a designated 
time by estimating the execution time of decoding in 
accordance with the estimated execution time of each 
element and replacing halfpixel prediction having a long 
execution time with fullpixel prediction when the estimated 
decoding execution time may exceed the time (designated 
time) required to receive the data for one sheet. 



